


How To Download Multiple Files On Downloadani.me

Last Updated on September 12, 2022

The ThreadPoolExecutor class in Python can be used to download multiple files at the same time.

This can dramatically speed up the download process compared to downloading each file sequentially, one by one.

In this tutorial, you will discover how to concurrently download multiple files from the internet using threads in Python.

After completing this tutorial, you will know:

  • How to download files from the internet one-by-one in Python and how slow it can be.
  • How to use the ThreadPoolExecutor to manage a pool of worker threads.
  • How to update the code to download multiple files at the same time and dramatically accelerate the process.

Let's dive in.

How to Download Files From the Internet (slowly)

Downloading files from the internet is a common task.

For instance:

  • You may need to download all documents for offline reference.
  • You might need to create a backup of all .zip files for a project.
  • You might need a local copy of a file archive.

If you did not know how to code, then you might solve this problem by loading the webpage in a browser and clicking each file in turn to save it to your hard drive.

For more than a few files, this could take a long time: first clicking each file, then waiting for all of the downloads to complete.

Thankfully, we're developers, so we can write a script to first discover all of the files on a webpage to download, then download and store them all locally.

Some examples of web pages that list files we might want to download include:

  • All versions of the Beautiful Soup library version 4.
  • All bot modifications for the Quake 1 computer game.
  • All documents for Python version 3.9.7.

In these examples, there is a single HTML webpage that provides relative links to locally hosted files.

Ideally, the site would link to multiple small or modestly sized files, and the server itself would allow multiple connections from each client.

This may not always be the case, as most servers limit the number of connections per client to ten or even one to prevent denial of service attacks.

We can develop a program to download the files one by one.

There are a few parts to this program; for example:

  1. Download URL File
  2. Parse HTML for Files to Download
  3. Save Download to Local File
  4. Coordinate Download Process
  5. Complete Example

Let's look at each piece in turn.

Note: we are only going to implement the most basic error handling. If you change the target URL, you may need to adapt the code for the specific details of the HTML and files you wish to download.

Download URL File

The first step is to download a file specified by a URL.

There are many ways to download a URL in Python. In this example, we will use the urlopen() function to open the connection to the URL and call the read() function to download the contents of the file into memory.

To ensure the connection is closed automatically once we are finished downloading, we will use a context manager, e.g. the with keyword.

You can learn more about opening and reading from URL connections in the Python API here:

  • urllib.request – Extensible library for opening URLs

The download_url() function below implements this, taking a URL and returning the contents of the file.

# load a file from a URL, returns content of downloaded file
def download_url(urlpath):
    # open a connection to the server
    with urlopen(urlpath) as connection:
        # read the contents of the url as bytes and return it
        return connection.read()

We will use this function to download the HTML page that lists the files and to download the contents of each file listed.

An improvement to this function would be to add a timeout on the connection, perhaps after a few seconds. This will throw an exception if the host does not respond and is a good practice when opening connections to remote servers.
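
For example, a minimal sketch of such a timeout (using an assumed 10-second limit; the exact value is up to you) might look like the following:

# load a file from a URL with a connection timeout (a sketch, not the tutorial's final code)
from urllib.request import urlopen

def download_url_with_timeout(urlpath, timeout=10):
    try:
        # give up if the host does not respond within `timeout` seconds
        with urlopen(urlpath, timeout=timeout) as connection:
            return connection.read()
    except OSError as e:
        # URLError and socket timeouts are both subclasses of OSError
        print(f'Failed to download {urlpath}: {e}')
        return None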

Parse HTML for Files to Download

Once the HTML page of URLs is downloaded, we must parse it and extract all of the links to files.

I recommend the BeautifulSoup Python library anytime HTML documents need to be parsed.

If you're new to BeautifulSoup, you can install it easily with your Python package manager, such as pip:

pip install beautifulsoup4

First, we must decode the raw bytes downloaded into text.

This can be achieved by calling the decode() function on the raw data and specifying a standard text encoding, such as UTF-8.

...
# decode the provided content as text
html = content.decode('utf-8')

Next, we can parse the text of the HTML document using BeautifulSoup, using the default parser.

It is a good idea to use a more sophisticated parser that works the same on all platforms, but in this case we will use the default parser, as it does not require you to install anything extra.

...
# parse the document as best we can
soup = BeautifulSoup(html, 'html.parser')
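
If you do prefer a more robust third-party parser, one option (an assumption on my part, not something this tutorial requires) is lxml, which BeautifulSoup will use if it is installed:

# optional alternative: use the third-party lxml parser (pip install lxml)
soup = BeautifulSoup(html, 'lxml')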

We can then retrieve all <a href=""> HTML tags from the document, as these will contain the URLs for the files we wish to download.

...
# find all of the <a href=""> tags in the document
atags = soup.find_all('a')

We can then iterate through all of the found tags and retrieve the contents of the href property on each, e.g. get the links to files from each tag. If a tag does not contain an href property (which would be odd, e.g. an anchor link), then we will have the function return None.

This can be done in a list comprehension, giving us a list of URLs to files to download and possibly some None values.

...
# get all href values (links) or None if not present (unlikely)
return [t.get('href', None) for t in atags]

Tying this all together, the get_urls_from_html() function below takes the downloaded HTML content and returns a list of file URLs.

# decode downloaded html and extract all <a href=""> links
def get_urls_from_html(content):
    # decode the provided content as text
    html = content.decode('utf-8')
    # parse the document as best we can
    soup = BeautifulSoup(html, 'html.parser')
    # find all of the <a href=""> tags in the document
    atags = soup.find_all('a')
    # get all href values (links) or None if not present (unlikely)
    return [t.get('href', None) for t in atags]

We will use this function to extract all of the file links from the HTML page that lists the files we wish to download.

Save Download to Local File

We know how to download files and get links out of HTML. Another important piece we need is to save downloaded URLs to local files.

This can be done using the open() built-in Python function.

We will open the file in write mode and binary format, as we will likely be downloading .zip files or similar.

As with downloading URLs above, we will use the context manager (the with keyword) to ensure that the file is closed once we are finished writing the contents.

The save_file() function below implements this, taking the local path for saving the file and the content of the file downloaded from a URL.

# save provided content to the local path
def save_file(path, data):
    # open the local file for writing
    with open(path, 'wb') as file:
        # write all provided data to the file
        file.write(data)

You may want to add error handling here, such as for a failure to write or the case of the file already existing locally.
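
As a rough sketch (my own addition, not part of the tutorial's final code), that error handling might look something like the following, skipping files that already exist and reporting write failures:

# save provided content to the local path, with basic error handling (a sketch)
from os.path import exists

def save_file_safely(path, data):
    # skip the write if the file already exists locally
    if exists(path):
        print(f'>already exists, skipping {path}')
        return
    try:
        # open the local file for writing in binary mode
        with open(path, 'wb') as file:
            file.write(data)
    except OSError as e:
        # report a failure to write (e.g. permissions, disk full)
        print(f'Failed to save {path}: {e}')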

We need to do some work in order to be able to use this function.

For example, we need to check that we have a link from the href property (that it's not None), and that the URL we are trying to download is a file (e.g. has a .zip or .gz extension). This is some primitive error checking or filtering over the types of URLs we would like to download.

...
# skip bad urls or bad filenames
if link is None or link == '../':
    return
# check for no file extension
if not (link[-4] == '.' or link[-3] == '.'):
    return

HTML pages that list files typically include relative rather than absolute links.

That is, the links to the files to be downloaded will be relative to the HTML file. We must convert them to absolute links (with http://... at the front) in order to be able to download the files.

We can use the urljoin() function from the urllib.parse module to convert any relative URLs to absolute URLs so that we can download each file.

...
# convert relative link to absolute link
absurl = urljoin(url, link)

The absolute URL can then be downloaded using our download_url() function developed above.

...
# download the content of the file
data = download_url(absurl)

Next, we need to determine the name of the file from the URL, which we will use for saving locally. This can be achieved using the basename() function from the os.path module.

...
# get the filename
filename = basename(absurl)

We can then determine the local path for saving the file by combining the local directory path with the filename. The join() function from the os.path module will do this for us correctly based on our platform (Unix, Windows, macOS).

...
# construct the output path
outpath = join(path, filename)

We now have enough information to save the file by calling our save_file() function.

...
# save to file
save_file(outpath, data)

The download_url_to_file() function below ties this all together, taking the URL for the HTML page that lists files, a link to one file on that page to download, and the local path for saving files. It downloads the URL, saves it locally as a file, and returns the relative link and local file path in a tuple.

If the file was not saved, we return a tuple with just the relative URL and None for the local path.

We choose to return some data from this function so that the caller can report progress about the success or failure of downloading each file.

# download one file to a local directory
def download_url_to_file(url, link, path):
    # skip bad urls or bad filenames
    if link is None or link == '../':
        return (link, None)
    # check for no file extension
    if not (link[-4] == '.' or link[-3] == '.'):
        return (link, None)
    # convert relative link to absolute link
    absurl = urljoin(url, link)
    # download the content of the file
    data = download_url(absurl)
    # get the filename
    filename = basename(absurl)
    # construct the output path
    outpath = join(path, filename)
    # save to file
    save_file(outpath, data)
    # return results
    return (link, outpath)

Coordinate Download Process

Finally, we need to coordinate the overall process.

First, the URL for the HTML page must be downloaded using our download_url() function defined above.

...
# download the html webpage
data = download_url(url)

Next, we need to create any directories on the local path where we have chosen to store the local files. We can do this using the makedirs() function from the os module, ignoring the case where the directories already exist.

...
# create a local directory to save files
makedirs(path, exist_ok=True)

Next, we will retrieve a list of all relative links to files listed on the HTML page by calling our get_urls_from_html() function, and report how many links were found.

...
# parse html and retrieve all href urls listed
links = get_urls_from_html(data)
# report progress
print(f'Found {len(links)} links in {url}')

Finally, we will iterate through the list of links, download each to a local file with a call to our download_url_to_file() function, then report the success or failure of each download.

...
# download each file on the webpage
for link in links:
    # download the url to a local file
    link, outpath = download_url_to_file(url, link, path)
    # check for a link that was skipped
    if outpath is None:
        print(f'>skipped {link}')
    else:
        print(f'Downloaded {link} to {outpath}')

Tying this together, the download_all_files() function below takes a URL to an HTML webpage that lists files and a local path for downloads, then downloads all files listed on the page, reporting progress along the way.

# download all files on the provided webpage to the provided path
def download_all_files(url, path):
    # download the html webpage
    data = download_url(url)
    # create a local directory to save files
    makedirs(path, exist_ok=True)
    # parse html and retrieve all href urls listed
    links = get_urls_from_html(data)
    # report progress
    print(f'Found {len(links)} links in {url}')
    # download each file on the webpage
    for link in links:
        # download the url to a local file
        link, outpath = download_url_to_file(url, link, path)
        # check for a link that was skipped
        if outpath is None:
            print(f'>skipped {link}')
        else:
            print(f'Downloaded {link} to {outpath}')

Complete Example

We now have all of the elements to download all files listed on an HTML webpage.

Let's test it out.

Choose one of the URLs listed above and use the functions that we have developed to download all files listed.

In this case, I will choose to download all bot modifications developed by third parties for the Quake 1 computer game.

...
# url of html page that lists all files to download
URL = 'https://www.quaddicted.com/files/idgames2/quakec/bots/'

Below is an example of what this webpage looks like.

Screenshot of the HTML Webpage that Lists all Quake 1 Bots

We will save all files in a local directory under a tmp/ subdirectory.

...
# local directory to save all files on the html page
PATH = 'tmp'

We can then call the download_all_files() function that we developed above to kick off the download process.

...
# download all files on the html webpage
download_all_files(URL, PATH)

Tying this together, the complete example of downloading all files on the HTML webpage is listed below.

# SuperFastPython.com
# download all files from a website sequentially
from os import makedirs
from os.path import basename
from os.path import join
from urllib.request import urlopen
from urllib.parse import urljoin
from bs4 import BeautifulSoup

# load a file from a URL, returns content of downloaded file
def download_url(urlpath):
    # open a connection to the server
    with urlopen(urlpath) as connection:
        # read the contents of the url as bytes and return it
        return connection.read()

# decode downloaded html and extract all <a href=""> links
def get_urls_from_html(content):
    # decode the provided content as text
    html = content.decode('utf-8')
    # parse the document as best we can
    soup = BeautifulSoup(html, 'html.parser')
    # find all of the <a href=""> tags in the document
    atags = soup.find_all('a')
    # get all href values (links) or None if not present (unlikely)
    return [t.get('href', None) for t in atags]

# save provided content to the local path
def save_file(path, data):
    # open the local file for writing
    with open(path, 'wb') as file:
        # write all provided data to the file
        file.write(data)

# download one file to a local directory
def download_url_to_file(url, link, path):
    # skip bad urls or bad filenames
    if link is None or link == '../':
        return (link, None)
    # check for no file extension
    if not (link[-4] == '.' or link[-3] == '.'):
        return (link, None)
    # convert relative link to absolute link
    absurl = urljoin(url, link)
    # download the content of the file
    data = download_url(absurl)
    # get the filename
    filename = basename(absurl)
    # construct the output path
    outpath = join(path, filename)
    # save to file
    save_file(outpath, data)
    # return results
    return (link, outpath)

# download all files on the provided webpage to the provided path
def download_all_files(url, path):
    # download the html webpage
    data = download_url(url)
    # create a local directory to save files
    makedirs(path, exist_ok=True)
    # parse html and retrieve all href urls listed
    links = get_urls_from_html(data)
    # report progress
    print(f'Found {len(links)} links in {url}')
    # download each file on the webpage
    for link in links:
        # download the url to a local file
        link, outpath = download_url_to_file(url, link, path)
        # check for a link that was skipped
        if outpath is None:
            print(f'>skipped {link}')
        else:
            print(f'Downloaded {link} to {outpath}')

# url of html page that lists all files to download
URL = 'https://www.quaddicted.com/files/idgames2/quakec/bots/'
# local directory to save all files on the html page
PATH = 'tmp'
# download all files on the html webpage
download_all_files(URL, PATH)

Running the example will first download the HTML webpage that lists all files, then download each file listed on the page into the tmp/ directory.

This will take a long time, perhaps 3-4 minutes, as the files are downloaded one at a time.

As the program runs, it will report useful progress.

First, it reports that 113 links were found on the page and that some directories were skipped. It then reports the filenames and local paths of each file saved.

Found 113 links in https://www.quaddicted.com/files/idgames2/quakec/bots/
>skipped ../
>skipped eliminator/
>skipped reaper/
Downloaded attacker.txt to tmp/attacker.txt
Downloaded attacker.zip to tmp/attacker.zip
Downloaded bgadm101.txt to tmp/bgadm101.txt
Downloaded bgadm101.zip to tmp/bgadm101.zip
Downloaded bgbot16.txt to tmp/bgbot16.txt
Downloaded bgbot16.zip to tmp/bgbot16.zip
...

How long did it take to run on your computer?
Let me know in the comments below.

Next, we will look at the ThreadPoolExecutor class, which can be used to create a pool of worker threads that will allow us to speed up this download process.

How to Create a Pool of Worker Threads With ThreadPoolExecutor

We can use the ThreadPoolExecutor class to speed up the download of multiple files listed on an HTML webpage.

The ThreadPoolExecutor class is provided as part of the concurrent.futures module for easily running concurrent tasks.

The ThreadPoolExecutor provides a pool of worker threads, which is different from the ProcessPoolExecutor, which provides a pool of worker processes.

Generally, the ThreadPoolExecutor should be used for concurrent IO-bound tasks, like downloading URLs, and the ProcessPoolExecutor should be used for concurrent CPU-bound tasks, like calculating.

Using the ThreadPoolExecutor is designed to be easy and straightforward. It is like the "automatic mode" for Python threads.

  1. Create the thread pool by calling ThreadPoolExecutor().
  2. Submit tasks and get futures by calling submit().
  3. Wait and get results as tasks complete by calling as_completed().
  4. Shut down the thread pool by calling shutdown().

Create the Thread Pool

First, a ThreadPoolExecutor instance must be created.

By default, it will create a pool of threads equal to the number of logical CPU cores in your system plus four.

This is good for most purposes.
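
Specifically, on Python 3.8 and later the default is min(32, (os.cpu_count() or 1) + 4) worker threads. A quick way to see what you would get on your machine:

# inspect the default worker count used by ThreadPoolExecutor (Python 3.8+)
from os import cpu_count
print(min(32, (cpu_count() or 1) + 4))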

...
# create a thread pool with the default number of worker threads
pool = ThreadPoolExecutor()

You can run tens to hundreds of concurrent IO-bound threads per CPU, although perhaps not thousands or tens of thousands.

You can specify the number of threads to create in the pool via the max_workers argument; for example:

...
# create a thread pool with 10 worker threads
pool = ThreadPoolExecutor(max_workers=10)

Submit Tasks to the Thread Pool

Once created, we can send tasks into the pool to be completed using the submit() function.

This function takes the name of the function to call and any arguments, and returns a Future object.

The Future object is a promise to return the results from the task (if any) and provides a way to determine whether a specific task has completed or not.

...
# submit a task
future = pool.submit(my_task, arg1, arg2, ...)

The return value from a function executed by the thread pool can be accessed via the result() function on the Future object. It will wait until the result is available, if needed, or return immediately if the result is already available.

For example:

...
# get the result from a future
result = future.result()
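
If you only want to check whether a task has finished without blocking, the Future object also exposes a done() method (a brief aside, not used in this tutorial):

# check whether the task has completed without waiting
if future.done():
    result = future.result()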

Get Results as Tasks Complete

The beauty of performing tasks concurrently is that we can get results as they become available, rather than waiting for tasks to complete in the order they were submitted.

The concurrent.futures module provides an as_completed() function that we can use to get results for tasks as they are completed, just like its name suggests.

We can call the function and provide it a list of Future objects created by calling submit(), and it will return Future objects as they are completed, in whatever order.

For example, we can use a list comprehension to submit the tasks and create the list of Future objects:

...
# submit all tasks into the thread pool and create a list of futures
futures = [pool.submit(my_func, task) for task in tasks]

Then get results for tasks as they complete in a for loop:

...
# iterate over all submitted tasks and get results as they are available
for future in as_completed(futures):
    # get the result
    result = future.result()
    # do something with the result...

Shutdown the Thread Pool

Once all tasks are completed, we can shut down the thread pool, which will release each thread and any resources it may hold (e.g. the stack space).

...
# shutdown the thread pool
pool.shutdown()

An easier way to use the thread pool is via the context manager (the with keyword), which ensures it is shut down automatically once we are finished with it.

...
# create a thread pool
with ThreadPoolExecutor(max_workers=10) as pool:
    # submit tasks
    futures = [pool.submit(my_func, task) for task in tasks]
    # get results as they are available
    for future in as_completed(futures):
        # get the result
        result = future.result()
        # do something with the result...

Now that we are familiar with the ThreadPoolExecutor and how to use it, let's look at how we can adapt our program for downloading URLs to make use of it.

How to Download Multiple Files Concurrently

The program for downloading URLs to files can be adapted to use the ThreadPoolExecutor with very little change.

The download_all_files() function currently enumerates the list of links extracted from the HTML page and calls the download_url_to_file() function for each.

This loop can be updated to submit() tasks to a ThreadPoolExecutor object, and then we can wait on the Future objects via a call to as_completed() and report progress.

For example:

...
# create the pool of worker threads
with ThreadPoolExecutor(max_workers=20) as exe:
    # dispatch all download tasks to worker threads
    futures = [exe.submit(download_url_to_file, url, link, path) for link in links]
    # report results as they become available
    for future in as_completed(futures):
        # retrieve result
        link, outpath = future.result()
        # check for a link that was skipped
        if outpath is None:
            print(f'>skipped {link}')
        else:
            print(f'Downloaded {link} to {outpath}')

Here, we use 20 threads for up to 20 concurrent connections to the server that hosts the files. This may need to be adapted depending on the number of concurrent connections supported by the server from which you wish to download files.

The updated version of the download_all_files() function that uses the ThreadPoolExecutor is listed below.

# download all files on the provided webpage to the provided path
def download_all_files(url, path):
    # download the html webpage
    data = download_url(url)
    # create a local directory to save files
    makedirs(path, exist_ok=True)
    # parse html and retrieve all href urls listed
    links = get_urls_from_html(data)
    # report progress
    print(f'Found {len(links)} links in {url}')
    # create the pool of worker threads
    with ThreadPoolExecutor(max_workers=20) as exe:
        # dispatch all download tasks to worker threads
        futures = [exe.submit(download_url_to_file, url, link, path) for link in links]
        # report results as they become available
        for future in as_completed(futures):
            # retrieve result
            link, outpath = future.result()
            # check for a link that was skipped
            if outpath is None:
                print(f'>skipped {link}')
            else:
                print(f'Downloaded {link} to {outpath}')

Tying this together, the complete example of downloading all Quake 1 bots concurrently using the ThreadPoolExecutor is listed below.

# SuperFastPython.com
# download all files from a website concurrently
from os import makedirs
from os.path import basename
from os.path import join
from urllib.request import urlopen
from urllib.parse import urljoin
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import as_completed
from bs4 import BeautifulSoup

# load a file from a URL, returns content of downloaded file
def download_url(urlpath):
    # open a connection to the server
    with urlopen(urlpath) as connection:
        # read the contents of the url as bytes and return it
        return connection.read()

# decode downloaded html and extract all <a href=""> links
def get_urls_from_html(content):
    # decode the provided content as text
    html = content.decode('utf-8')
    # parse the document as best we can
    soup = BeautifulSoup(html, 'html.parser')
    # find all of the <a href=""> tags in the document
    atags = soup.find_all('a')
    # get all href values (links) or None if not present (unlikely)
    return [t.get('href', None) for t in atags]

# save provided content to the local path
def save_file(path, data):
    # open the local file for writing
    with open(path, 'wb') as file:
        # write all provided data to the file
        file.write(data)

# download one file to a local directory
def download_url_to_file(url, link, path):
    # skip bad urls or bad filenames
    if link is None or link == '../':
        return (link, None)
    # check for no file extension
    if not (link[-4] == '.' or link[-3] == '.'):
        return (link, None)
    # convert relative link to absolute link
    absurl = urljoin(url, link)
    # download the content of the file
    data = download_url(absurl)
    # get the filename
    filename = basename(absurl)
    # construct the output path
    outpath = join(path, filename)
    # save to file
    save_file(outpath, data)
    # return results
    return (link, outpath)

# download all files on the provided webpage to the provided path
def download_all_files(url, path):
    # download the html webpage
    data = download_url(url)
    # create a local directory to save files
    makedirs(path, exist_ok=True)
    # parse html and retrieve all href urls listed
    links = get_urls_from_html(data)
    # report progress
    print(f'Found {len(links)} links in {url}')
    # create the pool of worker threads
    with ThreadPoolExecutor(max_workers=20) as exe:
        # dispatch all download tasks to worker threads
        futures = [exe.submit(download_url_to_file, url, link, path) for link in links]
        # report results as they become available
        for future in as_completed(futures):
            # retrieve result
            link, outpath = future.result()
            # check for a link that was skipped
            if outpath is None:
                print(f'>skipped {link}')
            else:
                print(f'Downloaded {link} to {outpath}')

# url of html page that lists all files to download
URL = 'https://www.quaddicted.com/files/idgames2/quakec/bots/'
# local directory to save all files on the html page
PATH = 'tmp'
# download all files on the html webpage
download_all_files(URL, PATH)

Running the example will first download and parse the HTML page that lists all of the files.

Each file on the page is then downloaded concurrently, with up to 20 files being downloaded at the same time.

This dramatically speeds up the task, taking 10-15 seconds depending on your internet connection, compared to 3-4 minutes for the sequential version. That is roughly a 12x to 24x speedup.

Progress is reported like last time, but we can see that the files are reported out of order.

Here, we can see that the smaller .txt files finish before any of the .zip files.

Found 113 links in https://www.quaddicted.com/files/idgames2/quakec/bots/

>skipped ../

>skipped eliminator/

>skipped reaper/

Downloaded bgadm101.txt to tmp/bgadm101.txt

Downloaded attacker.txt to tmp/attacker.txt

Downloaded bgbot16.txt to tmp/bgbot16.txt

Downloaded bgbot20a.txt to tmp/bgbot20a.txt

Downloaded borg12.txt to tmp/borg12.txt

Downloaded bplayer2.txt to tmp/bplayer2.txt

...

How long did it take to run on your computer?
Let me know in the comments below.


Extensions

This section lists ideas for extending the tutorial.

  • Add connection timeout: Update the download_url() function to add a timeout to the connection and handle the exception if thrown.
  • Add file saving error handling: Update save_file() to handle the case of not being able to write the file and/or a file already existing at that location.
  • Only download relative URLs: Add a check for the extracted links to ensure that the program only downloads relative URLs and ignores absolute URLs or URLs for a different host (a rough sketch follows this list).

Share your extensions in the comments below; it would be great to see what you come up with.

Further Reading

This section provides additional resources that you may find helpful.

  • concurrent.futures - Launching parallel tasks
  • ThreadPoolExecutor: The Complete Guide
  • ThreadPoolExecutor Class API Cheat Sheet
  • Concurrent Futures API Interview Questions
  • ThreadPoolExecutor Jump-Start (my 7-day course)

Takeaways

In this tutorial, you learned how to download multiple files concurrently using a pool of worker threads in Python. You learned:

  • How to download files from the internet sequentially in Python and how slow it can be.
  • How to use the ThreadPoolExecutor to manage a pool of worker threads.
  • How to update the code to download multiple files at the same time and dramatically accelerate the process.

Do you have any questions?
Leave your question in a comment below and I will reply fast with my best advice.

Photo by Berend Verheijen on Unsplash

Source: https://superfastpython.com/threadpoolexecutor-download-files/
