Parallel For Each loop CSV output issues

  • Thread starter: deskcheck1

Hi,

This method predicts a stream bed type and calculates the size of its bedrock using a machine learning model.

Ideally, every numeric value written to the CSV file should be in plain decimal (float) notation. Instead, the SLOPE value is written in scientific notation (for example, 1E-05). Sample CSV output is below:

COMID,LENGTHKM,SLOPE,MUID,NA_L3NAME,BedRockDepth_mm,DrainageArea_km,Width,Depth,STRBED, D50_mm,Accuracy %
661274,1.843,1E-05,2511840,Wasatch and Uinta Mountains,0,9.2007,5.89697,0.481299,sand,2,63.93
661276,0.055,1E-05,2511840,Wasatch and Uinta Mountains,0,9.2088,5.89879,0.481389,sand,2,85.64
661278,0.241,1E-05,2511840,Wasatch and Uinta Mountains,0,0.1422,1.35889,0.198011,sand,2,81.02
661280,0.847,1E-05,2511840,Wasatch and Uinta Mountains,0,13.0761,6.67366,0.518718,sand,2,64.31

The actual output right now is like this:

COMID,LENGTHKM,SLOPE,MUID,NA_L3NAME,BedRockDepth_mm,DrainageArea_km,Width,Depth,STRBED, D50_mm,Accuracy %
18482048,0.063,1E-05,162362,Eastern Corn Belt Plains,2000,12.5019,6.569,0.51378,sand,2,46.2518482048,0.063,1E-05,162362,Eastern Corn Belt Plains,2000,12.5019,6.569,0.51378,sand,2,46.25
18482048,0.063,1E-05,162362,Eastern Corn Belt Plains,2000,12.5019,6.569,0.51378,sand,2,46.25
COMID,LENGTHKM,SLOPE,MUID,NA_L3NAME,BedRockDepth_mm,DrainageArea_km,Width,Depth,STRBED, D50_mm,Accuracy %
18482048,0.063,1E-05,162362,Eastern Corn Belt Plains,2000,12.5019,6.569,0.51378,sand,2,46.2518482048,0.063,1E-05,162362,Eastern Corn Belt Plains,2000,12.5019,6.569,0.51378,sand,2,46.25
18482048,0.063,1E-05,162362,Eastern Corn Belt Plains,2000,12.5019,6.569,0.51378,sand,2,46.25

The header is somehow repeated, and some records run together on a single line. I need the header to be written only once, at the top, and I need each record to end with a carriage return so that it sits on its own line.

My parallel code is below:

private static void PerformTestParallel(string inputFile, string outputFile)
{
    //Read initial time
    Stopwatch sw = new Stopwatch();
    sw.Start();

    //Open and read input file
    List<string> testData = LoadCsvFile(inputFile);

    var input = new ModelInput();

    var header = string.Format("{0},{1},{2},{3},{4},{5},{6},{7},{8},{9}, {10},{11}\r\n", "COMID", "LENGTHKM", "SLOPE",
        "MUID", "NA_L3NAME", "BedRockDepth_mm", "DrainageArea_km", "Width", "Depth", "STRBED", "D50_mm", "Accuracy %");

    int i = 0;
    int len = testData.Count;
    ConsoleUtility.WriteProgressBar2(0, 0);

    using (StreamWriter writer = new StreamWriter(new FileStream(outputFile, FileMode.Create)))
    {
        writer.Write(header);

        // Parallelize the outer loop to partition the source array by rows.
        Parallel.ForEach(testData, (item) =>
        {
            testData.Skip(1);
            i++;

            ConsoleUtility.WriteProgressBar2(i, len, true);
            Thread.Sleep(50);

            if (item != null)
            {
                string[] line = item.Split(',');

                input.COMID = Convert.ToSingle(line[0]);
                input.LENGTHKM = Convert.ToSingle(line[1]);
                input.SLOPE = Convert.ToSingle(line[2]);
                input.MUID = Convert.ToSingle(line[3]);
                input.NA_L3NAME = line[4].ToString();
                input.BROCKDEPMIN_mm = Convert.ToSingle(line[5]);
                input.DRAINAGE_AREAKM = Convert.ToSingle(line[6]);
                input.WIDTH = Convert.ToSingle(line[7]);
                input.DEPTH = Convert.ToSingle(line[8]);
            }

            // Make a single prediction on the test data and write results to CSV file
            ModelOutput result = ConsumeModel.Predict(input);
            input.STRBED = result.Prediction;

            var scores = result.Score.ToArray();
            Array.Sort(scores);

            double maxValue = Math.Round(Convert.ToDouble(scores.Max() * 100), 2);

            var d50size = CalculateD50mm(input.STRBED);

            //Write to string
            var newLine = string.Format("{0},{1},{2},{3},{4},{5},{6},{7},{8},{9},{10},{11}", input.COMID, input.LENGTHKM,
                input.SLOPE, input.MUID, input.NA_L3NAME, input.BROCKDEPMIN_mm, input.DRAINAGE_AREAKM, input.WIDTH, input.DEPTH,
                input.STRBED, d50size, maxValue);

            writer.WriteLine(newLine);
            writer.Flush();
        });

    }
}

I would appreciate any help.




Marilyn Gambone