multithreading - Synchronise muti-threads in Python -
multithreading - Synchronise muti-threads in Python -
the class brokenlinktest in code below following.
takes web page url finds links in web page get headers of links concurrently (this done check if link broken or not) print 'completed' when headers received.from bs4 import beautifulsoup import requests class brokenlinktest(object): def __init__(self, url): self.url = url self.thread_count = 0 self.lock = threading.lock() def execute(self): soup = beautifulsoup(requests.get(self.url).text) self.lock.acquire() link in soup.find_all('a'): url = link.get('href') threading.thread(target=self._check_url(url)) self.lock.acquire() def _on_complete(self): self.thread_count -= 1 if self.thread_count == 0: #check if threads completed self.lock.release() print "completed" def _check_url(self, url): self.thread_count += 1 print url result = requests.head(url) print result self._on_complete() brokenlinktest("http://www.example.com").execute() can concurrency/synchronization part done in improve way. did using threading.lock. first experiment python threading.
def execute(self): soup = beautifulsoup(requests.get(self.url).text) threads = [] link in soup.find_all('a'): url = link.get('href') t = threading.thread(target=self._check_url, args=(url,)) t.start() threads.append(t) thread in threads: thread.join()
you utilize join method wait threads finish.
note added start call, , passed bound method object target param. in original illustration calling _check_url in main thread , passing homecoming value target param.
python multithreading synchronization
Comments
Post a Comment