mclagett
Guest
Hi --
This one I don't really understand. In the code below, everything executes okay through the binding of filmResults, and when I set a breakpoint on the next line, I can see that the filmResults array contains two values. Drilling into these, I can see that each value (of type HtmlProvider<...>) has a tables collection and each of these has a table named Table6. Yet when the Array.map function in the binding to detailsTables iterates over the array, it finds the first of these Table6 tables fine, but the second one throws an IndexOutOfRangeException. Has anyone encountered something like this before?
I can't for the life of me see why anything would be wrong, unless for some reason after processing the first Table6, the library is expecting the second Table6 to have the same structure as the first one. They do have different structures; the first has five columns and the second has three columns (each call to the details page brings back a table that is structured for the details available for the particular id whose details are being fetched).
Any thoughts would be greatly appreciated, as I am sort of dead in the water at the moment.
type LumiereFilmStartingWith = HtmlProvider<"http://lumiere.obs.coe.int/web/films/index.php?letter=A">
type LumiereFilmReleaseDetail = HtmlProvider<"LUMIERE : Film: À 14 ans">

type Lumiere() =
    member public this.StartingWithA() =
        let thePage = LumiereFilmStartingWith.GetSample()
        let tables = thePage.Tables
        let html = thePage.Html
        // Collect the numeric film ids from every link of the form "...?id=NNN"
        let ids =
            html.Descendants ["a"]
            |> Seq.choose(fun x ->
                x.TryGetAttribute("href")
                |> Option.map(fun a -> a.Value()))
            |> Seq.filter(fun h -> h.Contains("?id="))
            |> Seq.map(fun h ->
                let delimiterIndex = h.LastIndexOf("?id=")
                h.Substring(delimiterIndex+4))
            |> Seq.map(fun s -> (int) s)
            |> Seq.toArray
        // Pair each row of the index table with the id found at the same position
        let startingWithA =
            tables.``ABCDEFGHIJKLMNOPQRSTUVWXYZ[0-9]``.Rows
            |> Array.mapi (fun i f -> {id = ids.[i]; film = f.Film; directors = f.Directors})
        // Load the details pages for the first two films in parallel
        let filmResults =
            startingWithA
            |> Array.map(fun fd ->
                let url = "http://lumiere.obs.coe.int/web/film_info/?id=" + fd.id.ToString()
                LumiereFilmReleaseDetail.AsyncLoad url)
            |> Array.take 2
            |> Async.Parallel
            |> Async.RunSynchronously
        // The first element works; the second throws IndexOutOfRangeException
        let detailsTables =
            filmResults
            |> Array.map(fun p -> p.Tables.Table6)
        let filmDetails =
            detailsTables
            |> Array.map(fun t -> seq {for i in 1 .. (t.Rows.Length-1) do yield JsonConvert.SerializeObject(t.Rows.[i]) })
        startingWithA
Just in case it is helpful, here are the two HTML snippets from each of the tables' Html property:
<table class="fixed_layout_100">
<thead>
<tr>
<th align="CENTER">Market</th><th align="CENTER">Distributor</th><th align="CENTER">Release date</th><th align="RIGHT">2015</th><th align="CENTER">Total since 2015</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="RIGHT">
<a href="?id=64128&market=FR" target="_top" title="Admissions (Market : France)">FR</a>
</td><td align="CENTER">Ad Vitam</td><td align="CENTER">25/02/2015</td><td align="RIGHT">10 832</td><td align="RIGHT">10 832</td>
</tr>
<tr class="footer">
<td align="RIGHT">
<a href="/web/iso_codes/">EUR EU</a>
</td><td align="CENTER"> </td><td align="CENTER"> </td><td align="RIGHT">10 832</td><td align="RIGHT">10 832</td>
</tr>
<tr class="footer">
<td align="RIGHT">
<a href="/web/iso_codes/">EUR OBS</a>
</td><td align="CENTER"> </td><td align="CENTER"> </td><td align="RIGHT">10 832</td><td align="RIGHT">10 832</td>
</tr>
</tbody>
</table>
<table class="fixed_layout_100">
<thead>
<tr>
<th align="CENTER">Market</th><th align="RIGHT">2009</th><th align="CENTER">Total since 2006</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td align="RIGHT">
<a href="?id=33601&market=PT" target="_top" title="Admissions (Market : Portugal)">PT</a>
</td><td align="RIGHT">11</td><td align="RIGHT">11</td>
</tr>
<tr class="footer">
<td align="RIGHT">
<a href="/web/iso_codes/">EUR EU</a>
</td><td align="RIGHT">11</td><td align="RIGHT">11</td>
</tr>
<tr class="footer">
<td align="RIGHT">
<a href="/web/iso_codes/">EUR OBS</a>
</td><td align="RIGHT">11</td><td align="RIGHT">11</td>
</tr>
</tbody>
</table>
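To make the suspicion above concrete, here is a minimal sketch in plain F# that does not use FSharp.Data at all. It assumes, without any confirmation from the library's source, that a provided row type reads its cells by the fixed column positions of the sample table. `SampleRow` and `tryReadAll` are hypothetical names invented for this illustration, and the cell values are copied from the two HTML snippets above.

```fsharp
// Hypothetical stand-in for a row type whose accessors were fixed
// from the five-column sample table (an assumption, not FSharp.Data code).
type SampleRow(cells: string[]) =
    member this.Market = cells.[0]
    member this.Distributor = cells.[1]
    member this.ReleaseDate = cells.[2]
    member this.Year = cells.[3]
    member this.Total = cells.[4]

// First data row of each table, taken from the HTML snippets.
let fiveColumnRow = [| "FR"; "Ad Vitam"; "25/02/2015"; "10 832"; "10 832" |]
let threeColumnRow = [| "PT"; "11"; "11" |]

// Read every column the sample schema promises and report an
// out-of-range access as an Error instead of letting it propagate.
let tryReadAll (cells: string[]) =
    try
        let row = SampleRow(cells)
        Ok [ row.Market; row.Distributor; row.ReleaseDate; row.Year; row.Total ]
    with :? System.IndexOutOfRangeException as ex ->
        Error ex.Message

let first = tryReadAll fiveColumnRow    // matches the sample schema
let second = tryReadAll threeColumnRow  // has no cells at positions 3 and 4

printfn "%A" first
printfn "%A" second
```

If the provider really behaves like this sketch, the first page would succeed because its Table6 has the same five columns as the sample, while the second page would fail as soon as a column beyond the third is read.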