How to use Python3 to download a Pling package in a reliable manner?


#1

I would like to use a Python 3 script to download a package from Pling reliably. My script is below.
The URL in the script was obtained from an earlier download. It used to work, but no longer does. I then tried downloading the package manually from its page (https://www.gnome-look.org/p/1102582, click on the File tab). That worked once; after that, the manual download stopped working as well.

How can I download a package from www.gnome-look.org reliably? The goal is to automate installing the package on Ubuntu 18.04.

#!/usr/bin/python3.6
# -*- coding: utf-8 -*-

from pathlib import Path
from io import BytesIO
from urllib.request import Request, urlopen
from urllib.error import URLError
import tarfile


def get_url_response( url ):
    # Return the HTTP response for url, or None if the server could not be reached.
    req = Request( url )
    try:
        response = urlopen( req )
    except URLError as e:
        if hasattr( e, 'reason' ):
            print( 'We failed to reach a server.' )
            print( 'Reason: ', e.reason )
        return None
    else:
        # everything is fine
        return response


url = 'https://dl.opendesktop.org/api/files/download/id/1563810337/s/a095c4c96dfa61e69c56b78027672158576c35bf6c89887a36ea8e4fbddefe0c8137840a06d21a234218ee2ad8c633334356669b3b2d49440722a114f45fbada/t/1566985818/lt/download/Cupertino.tar.xz'
dst = Path().cwd() / 'tmp'

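response = get_url_response( url )
if response is None:
    # get_url_response() returns None when the request fails
    raise SystemExit( 'Download failed.' )
print( response.info() )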
with tarfile.open( fileobj=BytesIO( response.read() ), mode='r:xz' ) as tfile:
    tfile.extractall( path=dst )

Output:

Date: Wed, 04 Sep 2019 15:47:01 GMT
Server: Apache
Vary: Host,Accept-Encoding
Set-Cookie: PlingItId=tc3gg9is7kq9ta5ljlcer6f0r3; expires=Thu, 03-Sep-2020 15:47:01 GMT; Max-Age=31536000; path=/; domain=www.pling.com; HttpOnly
Expires: Wed, 04 Sep 2019 16:17:01 GMT
Cache-Control: private, no-cache, must-revalidate
Pragma: no-cache
X-Frame-Options: ALLOWALL
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html


Traceback (most recent call last):
  File "/usr/lib/python3.6/tarfile.py", line 1700, in xzopen
    t = cls.taropen(name, mode, fileobj, **kwargs)
  File "/usr/lib/python3.6/tarfile.py", line 1619, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/usr/lib/python3.6/tarfile.py", line 1482, in __init__
    self.firstmember = self.next()
  File "/usr/lib/python3.6/tarfile.py", line 2297, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/usr/lib/python3.6/tarfile.py", line 1092, in fromtarfile
    buf = tarfile.fileobj.read(BLOCKSIZE)
  File "/usr/lib/python3.6/lzma.py", line 200, in read
    return self._buffer.read(size)
  File "/usr/lib/python3.6/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/lib/python3.6/_compression.py", line 103, in read
    data = self._decompressor.decompress(rawblock, size)
_lzma.LZMAError: Input format not supported by decoder

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/master/.0MySetup/setupUbuntu18.04/test_tar_2.py", line 30, in <module>
    with tarfile.open( fileobj=BytesIO( response.read() ), mode='r:xz' ) as tfile:
  File "/usr/lib/python3.6/tarfile.py", line 1589, in open
    return func(name, filemode, fileobj, **kwargs)
  File "/usr/lib/python3.6/tarfile.py", line 1704, in xzopen
    raise ReadError("not an lzma file")
tarfile.ReadError: not an lzma file

#2

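Two errors:

  1. The traceback shows that the downloaded data is not an LZMA (.xz) file, so the tarfile library cannot open it. The "Content-Type: text/html" header in your output suggests that the server returned an HTML page instead of the archive.
  2. The other error is in the line print( 'Error code: ', e.code ).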
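
To make the first error easier to diagnose, the script could check the payload before handing it to tarfile. The following is only a minimal sketch: `looks_like_xz` and `extract_tar_xz` are hypothetical helper names, not part of the original script, and the check relies on the six magic bytes that open every .xz stream.

```python
import io
import tarfile

# Magic bytes at the start of every .xz stream.
XZ_MAGIC = b'\xfd7zXZ\x00'


def looks_like_xz(data):
    """Return True if the payload starts with the .xz magic bytes."""
    return data[:len(XZ_MAGIC)] == XZ_MAGIC


def extract_tar_xz(data, dst):
    """Extract an in-memory .tar.xz payload into dst (hypothetical helper)."""
    if not looks_like_xz(data):
        # Show the start of the payload; an HTML page is easy to spot here.
        raise ValueError('Not an xz archive, payload starts with: %r' % data[:60])
    with tarfile.open(fileobj=io.BytesIO(data), mode='r:xz') as tfile:
        tfile.extractall(path=dst)
```

In the script above this would replace the final `with` block, for example `extract_tar_xz( response.read(), dst )`, so that an HTML reply produces a readable message instead of the LZMA traceback.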

#3

Thank you for identifying the errors.

On Error 2: I agree. The script was taken from the documentation. After reading more about the urllib.error module, I realised that the exception urllib.error.URLError has only one attribute, ".reason", so I have removed the following branch:

    elif hasattr( e, 'code'):
        print( 'The server couldn\'t fulfill the request.' )
        print( 'Error code: ', e.code )
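
For reference, the `.code` attribute belongs to urllib.error.HTTPError, which is a subclass of URLError, so the status code can still be reported if the two are told apart. A minimal sketch, assuming that is the goal; `describe_error` is a hypothetical helper, and this `get_url_response` is a variant of the function in the script, not the original:

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def describe_error(exc):
    """Turn a urllib exception into a one-line message."""
    # HTTPError must be tested first because it is a subclass of URLError.
    if isinstance(exc, HTTPError):
        return 'The server could not fulfil the request. Error code: {}'.format(exc.code)
    return 'We failed to reach a server. Reason: {}'.format(exc.reason)


def get_url_response(url):
    """Return the response object, or None if the request failed."""
    try:
        return urlopen(Request(url))
    except URLError as exc:  # also catches HTTPError
        print(describe_error(exc))
        return None
```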

On Error 1: I find myself in a dilemma. Originally, the website provided the hyperlink to the .tar.xz file as:

url = 'https://dl.opendesktop.org/api/files/download/id/1563810337/s/a095c4c96dfa61e69c56b78027672158576c35bf6c89887a36ea8e4fbddefe0c8137840a06d21a234218ee2ad8c633334356669b3b2d49440722a114f45fbada/t/1566985818/lt/download/Cupertino.tar.xz'

It worked, and I was able to download the tar file. After some time, however, the link stopped working. I have just downloaded the same file, "Cupertino.tar.xz", manually via the link in the download column, and this time the site gave me a different hyperlink, which worked:

url = 'https://dl.opendesktop.org/api/files/download/id/1563810337/s/f64a2cf2d0fe040db76c01a38a90882a060be2f8fafa5fd2892f9eef6b8dcd5e1b1c1bb772929586f67a071577fbd706d963886ec1d759f87dfe9492075a2a07/t/1567653482/c/8d9cc68f0dd7ffabeea461394efb9e8b72cc0162c73a65c41bceb8709a21d18b6571bd032384ec624eab379abfd69dfa038b8233f1b3c8c56e565e6f13402dbd/lt/download/Cupertino.tar.xz'

Why does the hyperlink change over time? How can I write a stable Python script to download the .tar.xz file?
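
To compare the two links, the path can be split into its fields. This is only a sketch: it assumes the layout seen in the two URLs above (/api/files/download/ followed by key/value pairs and the file name), and it assumes that the "t" field is a Unix timestamp, which nothing in this thread confirms. `split_download_url` and `timestamp_field` are hypothetical names.

```python
from datetime import datetime, timezone
from urllib.parse import urlsplit


def split_download_url(url):
    """Split a download URL of the form shown above into its path fields."""
    segments = urlsplit(url).path.strip('/').split('/')
    # Assumed layout: api/files/download/<key>/<value>/.../<filename>
    rest = segments[3:]
    filename = rest.pop()
    fields = dict(zip(rest[0::2], rest[1::2]))
    fields['filename'] = filename
    return fields


def timestamp_field(fields):
    """Interpret the 't' field as a Unix timestamp in UTC (an assumption)."""
    return datetime.fromtimestamp(int(fields['t']), tz=timezone.utc)
```

Read this way, the "t" values of the two links are 2019-08-28 09:50:18 UTC and 2019-09-05 03:18:02 UTC, and the second link also carries an extra "c" field. That would fit links that are generated per download, but what the fields actually mean is a guess.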


#4

Hi tesla33,

We are currently changing the handling of direct download URLs to prevent abusive scripts from inflating download numbers.
Within the next 1-2 weeks we are going to adjust the download system to allow such actions again, since registered-user downloads and unknown external downloads will then be handled separately, unless we also receive user information together with the OCS API action.


#5

Thank you for the good news. On the practical side, how will the changes you are implementing affect the code I posted above? Is there any additional information that I need to provide?