How to find and remove duplicate data from an XML file

  • Thread starter: Sudip_inn

My XML looks like this:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<TickerBrokerDateMap>
<Broker>
<TickerBrokerDateFormatMap BrokerTab_Id="3" Broker_Id="2" Ticker_Id="MGP">
<StandardDate>1Q 2019A</StandardDate>
<ColumnCoordinate>BZ</ColumnCoordinate>
<TickerBrokerDateFormatMaps_Id>8</TickerBrokerDateFormatMaps_Id>
<BrokerDate StandardDate="1Q 2019A" Broker_Id="2" BrokerTab_Id="3">
<year>1Q19</year>
<Quater>1Q19</Quater>
</BrokerDate>
</TickerBrokerDateFormatMap>
<TickerBrokerDateFormatMap BrokerTab_Id="3" Broker_Id="2" Ticker_Id="MGP">
<StandardDate>2Q 2019A</StandardDate>
<ColumnCoordinate>CA</ColumnCoordinate>
<TickerBrokerDateFormatMaps_Id>8</TickerBrokerDateFormatMaps_Id>
<BrokerDate StandardDate="2Q 2019A" Broker_Id="2" BrokerTab_Id="3">
<year>2Q19</year>
<Quater>2Q19</Quater>
</BrokerDate>
</TickerBrokerDateFormatMap>
<TickerBrokerDateFormatMap BrokerTab_Id="3" Broker_Id="2" Ticker_Id="MGP">
<StandardDate>3Q 2019A</StandardDate>
<ColumnCoordinate>CB</ColumnCoordinate>
<TickerBrokerDateFormatMaps_Id>8</TickerBrokerDateFormatMaps_Id>
<BrokerDate StandardDate="3Q 2019A" Broker_Id="2" BrokerTab_Id="3">
<year>3Q19</year>
<Quater>3Q19</Quater>
</BrokerDate>
</TickerBrokerDateFormatMap>
<TickerBrokerDateFormatMap BrokerTab_Id="3" Broker_Id="2" Ticker_Id="MGP">
<StandardDate>4Q 2019A</StandardDate>
<ColumnCoordinate>CC</ColumnCoordinate>
<TickerBrokerDateFormatMaps_Id>8</TickerBrokerDateFormatMaps_Id>
<BrokerDate StandardDate="4Q 2019A" Broker_Id="2" BrokerTab_Id="3">
<year>4Q19</year>
<Quater>4Q19</Quater>
</BrokerDate>
</TickerBrokerDateFormatMap>
</Broker>
</TickerBrokerDateMap>

Sometimes there is duplicate data in TickerBrokerDateFormatMap. Two entries count as duplicates when they have the same StandardDate, BrokerTab_Id and Broker_Id.

If several entries share the same StandardDate, BrokerTab_Id and Broker_Id, the duplicates should be deleted from the XML file.

Below is how I first try to find the duplicate data. I do not think it is correct, so please look at my code and help me reach my objective.

My code:

// Requires: using System.Linq; using System.Xml.Linq;
XDocument xmlDocTargetFile = XDocument.Load(strADMFilePath);

// Group the rows by the composite key, then keep only the keys that occur
// more than once. Without the Where filter every key would be listed.
var stGroup = xmlDocTargetFile.Descendants("TickerBrokerDateFormatMap")
    .GroupBy(row => new
    {
        s = row.Element("StandardDate").Value,
        b = row.Attribute("Broker_Id").Value,
        bt = row.Attribute("BrokerTab_Id").Value
    })
    .Where(gr => gr.Count() > 1);

foreach (var item in stGroup)
{
    _lstDuplicates.Add(new TickerBrokerDateFormatMap
    {
        StandardDate = item.Key.s,
        BrokerTab_Id = item.Key.bt,
        Broker_Id = item.Key.b
    });
}
dgList.DataSource = _lstDuplicates;

_lstDuplicates is supposed to hold every duplicate entry, based on StandardDate, BrokerTab_Id and Broker_Id.

If the data were stored in a database, I would query it this way:

select StandardDate,Broker_Id,BrokerTab_Id,count(*)
from tmp_TickerBrokerDateFormatMap
group by StandardDate,Broker_Id,BrokerTab_Id
having count(*)>1
order by Broker_Id,BrokerTab_Id,StandardDate
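
For comparison, that query seems to map onto LINQ to XML roughly as follows. This is only a sketch: the class DuplicateFinder, the method FindDuplicateKeys and the tuple field names are hypothetical names made up for illustration, and it assumes every TickerBrokerDateFormatMap element carries a StandardDate child plus Broker_Id and BrokerTab_Id attributes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DuplicateFinder
{
    // Hypothetical helper: returns one tuple per key that occurs more than once,
    // together with how many times it occurs.
    public static List<(string StandardDate, string BrokerId, string BrokerTabId, int Occurrences)>
        FindDuplicateKeys(XDocument doc)
    {
        return doc.Descendants("TickerBrokerDateFormatMap")
            // GROUP BY StandardDate, Broker_Id, BrokerTab_Id
            // (anonymous types compare by value, so they work as a composite key)
            .GroupBy(row => new
            {
                StandardDate = (string)row.Element("StandardDate"),
                BrokerId = (string)row.Attribute("Broker_Id"),
                BrokerTabId = (string)row.Attribute("BrokerTab_Id")
            })
            // HAVING COUNT(*) > 1
            .Where(g => g.Count() > 1)
            // ORDER BY Broker_Id, BrokerTab_Id, StandardDate (compared as strings here)
            .OrderBy(g => g.Key.BrokerId)
            .ThenBy(g => g.Key.BrokerTabId)
            .ThenBy(g => g.Key.StandardDate)
            // SELECT StandardDate, Broker_Id, BrokerTab_Id, COUNT(*)
            .Select(g => (StandardDate: g.Key.StandardDate,
                          BrokerId: g.Key.BrokerId,
                          BrokerTabId: g.Key.BrokerTabId,
                          Occurrences: g.Count()))
            .ToList();
    }
}
```

The explicit (string) casts return null for a missing element or attribute instead of throwing, which is why they are used here rather than .Value.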

So how can I use LINQ to query my XML file, find the duplicate data based on StandardDate, BrokerTab_Id and Broker_Id, and remove those duplicates?

Please help me with a code sample. Thanks.
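
To make the goal concrete, here is a minimal sketch of the removal step. It assumes that "deleted" means keeping the first TickerBrokerDateFormatMap for each key and dropping the later repeats; if every copy should go, the Skip(1) would be replaced by a Count() > 1 filter. DuplicateRemover and RemoveDuplicates are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class DuplicateRemover
{
    // Hypothetical helper: removes every repeat of a
    // (StandardDate, Broker_Id, BrokerTab_Id) key, keeping the first element
    // in document order. Returns the number of elements removed.
    public static int RemoveDuplicates(XDocument doc)
    {
        var extras = doc.Descendants("TickerBrokerDateFormatMap")
            .GroupBy(row => new
            {
                StandardDate = (string)row.Element("StandardDate"),
                BrokerId = (string)row.Attribute("Broker_Id"),
                BrokerTabId = (string)row.Attribute("BrokerTab_Id")
            })
            // Skip the first element of each group; the rest are the repeats.
            .SelectMany(g => g.Skip(1))
            // Materialize the list before touching the tree, so the lazy
            // query is not enumerated while elements are being removed.
            .ToList();

        foreach (var element in extras)
        {
            element.Remove(); // detaches the element from its parent
        }

        return extras.Count;
    }
}
```

In the context of the code above, this would be called on the document returned by XDocument.Load(strADMFilePath), followed by Save(strADMFilePath) on the same document to write the cleaned XML back to disk.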
