How to visit all web pages using SemaphoreSlim

  • Thread starter: zydjohn

Hello:
I have the following C# code to visit some web sites (about 20 to 30):

private static SemaphoreSlim _semaphore = new SemaphoreSlim(3);
private static List<string> total_pages_urls = new List<string>();

public static async Task Browse_Page1(string page_url1)
{
    _semaphore.Wait();
    try
    {
        _page = await _proxy_browser.NewPageAsync();
        await _page.SetViewportAsync((new ViewPortOptions { Width = 1920, Height = 3938 }));
        await _page.GoToAsync(page_url1);
        await _page.WaitForTimeoutAsync(10000);
    }
    finally
    {
        _semaphore.Release();
    }
}

for (int i = 1; i <= 20; i++)
{
    string page_url1 = string.Format("https://www.mysite.com/today/{0}", i);
    total_pages_urls.Add(page_url1);
}

List<Task> all_page_tasks = new List<Task>();
foreach (string page_url1 in total_pages_urls)
{
    Task page_task1 = Browse_Page1(page_url1);
    all_page_tasks.Add(page_task1);
}
Task.WaitAll(all_page_tasks.ToArray());

The issue is that, no matter what settings I use, some web sites are not visited on each run.
I get error messages like the following:
Inner Exception 1:
NavigationException: net::ERR_ABORTED at https://www.mysite.com/today/10
https://www.mysite.com/today/10
Of the 20 web pages, typically 3 to 5 are not visited. The error is a timeout.
Is there a way to collect all the missing pages (3 to 5), build a new list of tasks from them, and run those tasks again? I hope that on the second iteration all the missing pages get visited.
By the way, I am using Visual Studio 2019 version 16.3.5, targeting .NET Core 3.0.
Please advise!
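
One way the retry described above could be sketched is shown below. This is only an illustration under stated assumptions, not a confirmed fix: RetryingVisitor, VisitAllAsync, TryVisitAsync, and the visitAsync delegate are hypothetical names that do not appear in the code above, and the browser call is deliberately left out so the sketch does not depend on any particular library. Each round visits the pending URLs with at most three in flight, collects the ones that threw, and runs only those again.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class RetryingVisitor
{
    // Hypothetical helper: visits every URL with at most maxParallel visits
    // in flight, then re-runs only the URLs that threw, for up to maxRounds
    // rounds. Returns the URLs that still failed after the last round.
    public static async Task<List<string>> VisitAllAsync(
        IEnumerable<string> urls,
        Func<string, Task> visitAsync,
        int maxParallel = 3,
        int maxRounds = 3)
    {
        List<string> pending = urls.ToList();
        using (var semaphore = new SemaphoreSlim(maxParallel))
        {
            for (int round = 1; round <= maxRounds && pending.Count > 0; round++)
            {
                // Each task yields its URL on failure, or null on success.
                List<Task<string>> tasks = pending
                    .Select(url => TryVisitAsync(url, visitAsync, semaphore))
                    .ToList();
                string[] results = await Task.WhenAll(tasks);

                // Only the failed URLs go into the next round.
                pending = results.Where(url => url != null).ToList();
            }
        }
        return pending;
    }

    private static async Task<string> TryVisitAsync(
        string url, Func<string, Task> visitAsync, SemaphoreSlim semaphore)
    {
        // WaitAsync yields instead of blocking the calling thread.
        await semaphore.WaitAsync();
        try
        {
            await visitAsync(url);
            return null;
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Failed: {url} ({ex.Message})");
            return url;
        }
        finally
        {
            semaphore.Release();
        }
    }
}
```

A call might look like `List<string> stillMissing = await RetryingVisitor.VisitAllAsync(total_pages_urls, Browse_Page1);`, assuming Browse_Page1 no longer takes the semaphore itself, since the helper already limits concurrency. Two details of the posted code may also be contributing to the failures, though that is a guess: `_semaphore.Wait()` blocks synchronously inside an async method, and `_page` looks like a shared field that every concurrent call overwrites, so a local variable per call may behave better.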
