Calculate SHA1 hash from binary and verify with a provided hash

Solutions:


I’m going to tell you the things in your code that made me wonder.

# Script that verifies the correctness of a toolbar offers

import urllib, bencode , hashlib

Extra space after bencode suggests lack of attention to detail.

from hashlib import sha1

url = "http://update.utorrent.com/installoffer.php?offer=conduit"

filename = "content.txt"

f= urllib.urlretrieve(url, filename)

Use of single-letter variables suggests lack of attention to clean code.

#Bdecoding the response

with open (str(filename), 'rb') as myfile:

filename is already a string. Converting it to a string again suggests confusion as to what is going on.

           data=myfile.read()
           decoded = bencode.bdecode(data)

Massive indentation suggests disregard for consistent indentation style.

Downloading the file to disk then reading it into memory is poor use of available APIs. You should download the file directly into memory.

# Returning the list for the key 'offer_urls'

Excess whitespace suggests someone is enter-happy.

list = decoded['offer_urls']

“Returning” is technically incorrect here. You aren’t in a function, so you aren’t returning anything. Use of list is a bad name for a variable since it’s a built-in Python type.

#print list


# Returning the URL of the binary that is prefixed with "http://download3.utorrent.com/"

length = len (list)

Putting spaces after function calls is odd, and you don’t do it consistently. There’s also rarely a good reason to store the length of a list.

prefix = "http://download3.utorrent.com/"

i= -1

No space before the =.

while i < length:
     i = i + 1
     if list[i].startswith(prefix) :
        break
binary= list[i]

You are using a while loop when you should be using a for loop. It is also going to cause an IndexError, rather than fail gracefully, if the prefix isn’t found.
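
Both points can be seen in a short sketch. The function names and sample URLs are made up for illustration, and the code is written to run on Python 3. The first function reproduces the logic of the while loop, the second is one possible for-loop replacement that returns None when nothing matches.

```python
def find_with_while(urls, prefix):
    """Same logic as the reviewed while loop, wrapped in a function."""
    i = -1
    while i < len(urls):
        i = i + 1
        # When no URL matches, i eventually equals len(urls) and this raises IndexError
        if urls[i].startswith(prefix):
            break
    return urls[i]


def find_with_for(urls, prefix):
    """Return the first URL starting with prefix, or None if there is none."""
    for url in urls:
        if url.startswith(prefix):
            return url
    return None


urls = ["http://example.com/a.exe", "http://example.org/b.exe"]  # hypothetical data

print(find_with_for(urls, "http://example.org/"))  # http://example.org/b.exe
print(find_with_for(urls, "http://missing/"))      # None

try:
    find_with_while(urls, "http://missing/")
except IndexError as error:
    print("while loop failed:", error)
```

Whether to return None or raise a descriptive exception on a miss is a design choice; either is easier to diagnose than an IndexError from inside the loop.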

print "The URL of the the binary is: " , binary


# Returning the sha1 hash contained in the value of the key 'offer_hash'


encrypted_hash = decoded['offer_hash']

Is that hash really encrypted?
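
The point behind the question can be checked directly: a SHA-1 digest is a one-way hash, not ciphertext, and hex-encoding only changes how it is written down. A minimal sketch, assuming Python 3, where binascii.hexlify stands in for the Python 2 .encode('hex') idiom used in the script; the payload is invented for the example.

```python
import binascii
import hashlib

payload = b"example payload"  # hypothetical data, not the real offer

raw_digest = hashlib.sha1(payload).digest()                # 20 raw bytes
hex_digest = binascii.hexlify(raw_digest).decode("ascii")  # 40 hex characters

# Hex-encoding the raw digest gives exactly what hexdigest() returns,
# so the two values can be compared as plain strings.
same = hex_digest == hashlib.sha1(payload).hexdigest()
print(len(raw_digest), len(hex_digest), same)  # 20 40 True
```

A name such as reported_hash describes the value more accurately than encrypted_hash.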

sha1_hash1 = encrypted_hash.encode('hex')
print "The sha1 hash contained in the value of the key 'offer_hash' is: " , sha1_hash1 


# Downloading the binary and calculating its hash

urllib.urlretrieve ( binary , "utct2-en-conduit-20130523.exe")
file = "C:\Python27\utct2-en-conduit-20130523.exe"

Hard-coded filename will be easy to break. You shouldn’t even need to do it. The string also contains "\" which should be escaped, or the whole string should be raw, i.e.

"C:\\Python27\\utct2-en-conduit-20130425.exe"

or

r"C:\Python27\utct2-en-conduit-20130425.exe"

or in most cases you can use the other slashes, even on Windows.

"C:/Python27/utct2-en-conduit-20130425.exe"

You got away with what you did, but that’s mostly because you are lucky.
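
To make the "lucky" part concrete: a backslash survives in a normal string literal only when the character after it does not form a recognized escape sequence. The sketch below uses a hypothetical path, chosen so that it contains \t, and is written for Python 3.

```python
# Hypothetical paths, chosen so that the unescaped one contains a real escape sequence
escaped = "C:\\temp\\tool.exe"  # backslashes escaped
raw = r"C:\temp\tool.exe"       # raw string, backslashes kept literally
forward = "C:/temp/tool.exe"    # forward slashes need no escaping
broken = "C:\temp\tool.exe"     # each "\t" is silently turned into a tab character

print(escaped == raw)    # True
print("\t" in broken)    # True
print("\\" in broken)    # False: no backslash is left in the string
```

A path like the broken one would fail at open() time with a confusing "file not found" error rather than at the line where the string was written.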

with open (file, 'rb') as myfile:
           downloaded=myfile.read()

This last piece of code did pretty much the same thing as a previous bit of code, but you didn’t use a function.

k = hashlib.sha1()
k.update(downloaded)
sha1file = k.hexdigest()
print "The sha1 hash of the downloaded binary is: " , sha1file

# Verify that the calculated sha1 hash is identical to the one provided in the offer details

if (sha1file == sha1_hash1) :

You’ve put parentheses, which are not necessary.

    print "Test result = Pass"
else :
    print "Test result = Fail"

Your basic problems:

  1. Your style is poor and inconsistent
  2. A few things suggest you don’t quite understand what’s going on
  3. You haven’t taken care to make the code readable

Here’s my take on the problem:

# Script that verifies the correctness of a toolbar offer

import urllib
import bencode
import hashlib
import contextlib

# it's common practice to put input details like this
# as global constants at the start of your script
URL = "http://update.utorrent.com/installoffer.php?offer=conduit"
PREFIX = "http://download3.utorrent.com/"

# by breaking down your task into functions
# you can make the code easier to follow
def read_bencoded_url(url):
    with contextlib.closing(urllib.urlopen(url)) as offer_file:
        return bencode.bdecode(offer_file.read())

def find_string_with_prefix(strings, prefix):
    for string in strings:
        if string.startswith(prefix):
            return string
    else:
        raise Exception("Did not find prefix: {}".format(prefix))

# this function is a bit more complicated
# but it avoids saving the file to disk or loading it entirely into memory.
# instead it hashes it 4096 bytes at a time
def hash_of_url(url):
    hasher = hashlib.sha1()
    with contextlib.closing(urllib.urlopen(url)) as binary_file:
        while True:
            data = binary_file.read(4096)
            if not data:
                break
            hasher.update(data)
    return hasher.hexdigest()

# the actual high level logic just calls the functions
# this avoids obscuring the logic with lower level details
def main():
    decoded = read_bencoded_url(URL)
    binary = find_string_with_prefix(decoded['offer_urls'], PREFIX)
    reported_hash = decoded['offer_hash'].encode('hex')
    actual_hash = hash_of_url(binary)
    print "The sha1 hash contained in the value of the key 'offer_hash' is: ", reported_hash
    print "The sha1 hash of the downloaded binary is: ", actual_hash

    if reported_hash == actual_hash:
        print "Test result = Pass"
    else:
        print "Test result = Fail"

if __name__ == '__main__':
    main()
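
The chunked hashing idea in hash_of_url can also be tried without network access. The following is a sketch for Python 3 rather than a port of the script above: sha1_of_stream and the sample data are made-up names, and io.BytesIO stands in for the downloaded binary. On Python 3, the response object returned by urllib.request.urlopen could be passed as the stream in the same way.

```python
import hashlib
import io


def sha1_of_stream(stream, chunk_size=4096):
    """Return the SHA-1 hex digest of a binary file-like object, read in chunks."""
    hasher = hashlib.sha1()
    # iter() with a sentinel keeps calling read() until it returns b"" (end of stream)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


# Stand-in for a downloaded binary, so the sketch runs offline
data = b"x" * 10000
chunked = sha1_of_stream(io.BytesIO(data))
print(chunked == hashlib.sha1(data).hexdigest())  # True
```

Feeding the hasher in chunks gives the same digest as hashing the whole payload at once, while keeping memory use bounded by the chunk size.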

Run your code through pep8 (the tool has since been renamed pycodestyle) and possibly a more pedantic static analyzer like pylint.

You will find these tools don’t like some of your formatting and variable names. They will likely complain there are too many variables, branches, etc. because the code is not broken up into modularized functions.

If I were to take your code and multiply it out a hundred or a thousand times into other files (the size of a regular commercial/enterprise application), it would be a dog’s breakfast and not maintainable at all.

You need to take more care with your code and make it readable and consistent. Think of it as formatting your resume for a potential employer. Use consistent whitespace between characters, consistent comments above the code (not multiple line breaks between) and above all adopt one of the common style guides for the language. Winston posted an excellent example earlier.

This shows you can at least write code that reads and presents well. If you’re coding in a professional environment, other people have to read the code you wrote as well, and they need to understand it quickly. If they’re scanning through the code and suddenly there’s a random comment floating in the page, it’s a real distraction.

A good book to read is Clean Code by Robert Martin. You must read this as if your life depended on it.
