Tech Notes for Memory: 2017

Wednesday, March 29, 2017

Working with the zipfile module in Python

A memo on the zipfile module and its most commonly used methods.

import zipfile

compressedFile = zipfile.ZipFile('test.zip')

compressedFile.namelist() - returns list of files and directories in archive

fileInfo = compressedFile.getinfo('file') - returns information about file in archive

fileInfo.file_size - returns size of the file

fileInfo.compress_size - returns size of the compressed data (divide file_size by compress_size to get the compression ratio)

compressedFile.extractall() - extracts everything from zip into current directory

compressedFile.extractall(path) - extracts everything from zip to destination directory

compressedFile.extract('file', path) - extracts a single file from zip to destination path
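A minimal sketch of the read-side calls above. The archive is built in a temporary directory first so the example does not depend on an existing test.zip; the member name notes.txt and the out folder are made up for illustration.

```python
import os
import tempfile
import zipfile

# Build a small throwaway archive so the example is self-contained.
work_dir = tempfile.mkdtemp()
archive_path = os.path.join(work_dir, 'test.zip')
with zipfile.ZipFile(archive_path, 'w', compression=zipfile.ZIP_DEFLATED) as new_zip:
    new_zip.writestr('notes.txt', 'spam ' * 200)

with zipfile.ZipFile(archive_path) as compressed_file:
    names = compressed_file.namelist()
    info = compressed_file.getinfo('notes.txt')
    # file_size is the original size, compress_size the size inside the archive.
    ratio = info.file_size / info.compress_size
    # extract() returns the path of the file it created.
    extracted_path = compressed_file.extract('notes.txt', os.path.join(work_dir, 'out'))

print(names, info.file_size, info.compress_size, round(ratio, 1))
```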

newZip = zipfile.ZipFile('test.zip', 'w')
newZip.write('test.txt', compress_type=zipfile.ZIP_DEFLATED)
newZip.close() - adds file to archive; however, opening with 'w' overwrites the whole existing archive

zipfile.ZipFile('test.zip', 'a') - opens the archive in append mode, so files can be added without overwriting what is already there
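To see the difference between the 'w' and 'a' modes, here is a small sketch that writes into a temporary archive. The member names are hypothetical, and writestr() is used instead of write() only so that no files need to exist on disk.

```python
import os
import tempfile
import zipfile

work_dir = tempfile.mkdtemp()
archive_path = os.path.join(work_dir, 'test.zip')

# 'w' creates the archive, replacing any existing file with that name.
with zipfile.ZipFile(archive_path, 'w') as new_zip:
    new_zip.writestr('first.txt', 'one')

# 'a' keeps the existing members and adds another one.
with zipfile.ZipFile(archive_path, 'a') as same_zip:
    same_zip.writestr('second.txt', 'two')
with zipfile.ZipFile(archive_path) as check:
    after_append = check.namelist()

# Opening with 'w' again starts from an empty archive.
with zipfile.ZipFile(archive_path, 'w') as new_zip:
    new_zip.writestr('third.txt', 'three')
with zipfile.ZipFile(archive_path) as check:
    after_rewrite = check.namelist()

print(after_append, after_rewrite)
```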

Working with files in Python (part 2) - shutil, send2trash, os

In this memo I'll save some common methods from the shutil and send2trash modules, plus a few more from the os module.

shutil.

import shutil

shutil.copy('source', 'destination') - copies a single file (destination can be a file name or an existing directory)

shutil.copytree('source', 'destination') - copies directory with all subdirectories & files

shutil.move('source', 'destination') - moves file or directory

shutil.rmtree(path) - permanently removes directory with all subdirectories and files
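A short sketch of the four shutil calls, run against a temporary directory so nothing real gets touched. All folder and file names here are invented for the example.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
source_dir = os.path.join(base, 'source')
os.makedirs(os.path.join(source_dir, 'sub'))
with open(os.path.join(source_dir, 'sub', 'a.txt'), 'w') as handle:
    handle.write('hello')

# copy() handles a single file and returns the path of the copy.
copied_file = shutil.copy(os.path.join(source_dir, 'sub', 'a.txt'),
                          os.path.join(base, 'a_copy.txt'))

# copytree() copies the whole tree; by default the destination must not exist yet.
copied_tree = shutil.copytree(source_dir, os.path.join(base, 'backup'))

# move() works for files and directories and returns the new location.
moved_tree = shutil.move(copied_tree, os.path.join(base, 'archive'))

# rmtree() deletes the directory and everything below it, with no undo.
shutil.rmtree(source_dir)
```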

send2trash.

import send2trash

send2trash.send2trash(path) - moves file or directory to the bin (third-party module, install it with pip install send2trash)

os.

import os

os.walk(path) - walks the directory tree and yields a tuple (directory path, list of subdirectory names, list of file names) for every directory

os.unlink(path) - permanently deletes file

os.rmdir(path) - permanently removes directory (only works if the directory is empty)
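The os calls above can be tried out safely on a temporary tree. This is just a sketch; the docs/old/draft.txt layout is hypothetical.

```python
import os
import tempfile

root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'docs', 'old'))
with open(os.path.join(root, 'docs', 'old', 'draft.txt'), 'w') as handle:
    handle.write('draft')

# Each step of os.walk() gives (directory, subdirectory names, file names).
found_files = []
for folder, subfolders, filenames in os.walk(root):
    for filename in filenames:
        found_files.append(os.path.join(folder, filename))

# unlink() removes one file; rmdir() only succeeds once the directory is empty.
os.unlink(found_files[0])
os.rmdir(os.path.join(root, 'docs', 'old'))

print(found_files)
```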

Monday, March 27, 2017

Working with files in Python

Some notes about working with files in Python.

A simple way to build a path:

import os
os.path.join('usr', 'user', 'Documents')

or if we need path to file:

os.path.join('home/Documents/work', 'important.txt')

os.getcwd() - check current directory

os.chdir('path to new working directory') - change directory

os.makedirs('home/docs/test') - create directory together with any missing parent directories

os.path.abspath('path') - get absolute path

os.path.relpath('path', 'start point') - get relative path

os.path.dirname('path') - get directory part of the path, ie everything before the last separator, not just the last folder name

os.path.basename('path') - get file name

os.path.split('path') - to get both directory name and file name as a tuple

or

examplePath.split(os.path.sep) - will return a list where every directory and file name is a separate string (in OSX the first item in the list will be '' for an absolute path, because it starts with the separator)
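A quick sketch that runs the path helpers above on one made-up relative path. Nothing has to exist on disk, since these functions only work on the strings.

```python
import os

example_path = os.path.join('home', 'Documents', 'work', 'important.txt')

directory = os.path.dirname(example_path)        # everything before the last separator
file_name = os.path.basename(example_path)       # the last component
both = os.path.split(example_path)               # (directory, file_name) as a tuple
parts = example_path.split(os.path.sep)          # every component as a list item

absolute = os.path.abspath(example_path)         # resolved against the current directory
relative = os.path.relpath(example_path, os.path.join('home', 'Documents'))

print(directory, file_name, both, parts, absolute, relative, sep='\n')
```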

os.listdir('path') - list all items in the directory (note: it lives in os, not os.path)

os.path.getsize('path') - get file size in bytes

os.path.exists('path') - returns True if directory or file exists

os.path.isfile('path') - returns True if file exists

os.path.isdir('path') - returns True if directory exists
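These checks combine nicely, for example to add up the size of the files in a folder. A small sketch on a temporary folder; report.txt and images are hypothetical names.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
report = os.path.join(folder, 'report.txt')
with open(report, 'w') as handle:
    handle.write('12345')
os.mkdir(os.path.join(folder, 'images'))

# listdir() gives bare names in no guaranteed order, so sort and re-join them.
entries = sorted(os.listdir(folder))

# Count only regular files; directories are skipped.
total_bytes = 0
for name in entries:
    full_path = os.path.join(folder, name)
    if os.path.isfile(full_path):
        total_bytes += os.path.getsize(full_path)

print(entries, total_bytes)
```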

open('path') - returns a file object opened for reading; raises FileNotFoundError if there's no such file (modes 'w' and 'a' create the file instead)

file.read() - reads the whole file as one string (file opened with open('path') or open('path', 'r'))

file.readlines() - reads the file and returns a list of its lines

file.write('text') - writes to a file opened with open('path', 'w') (this mode overwrites all existing content)

open('path', 'a') - will append text to the end of the file instead of overwriting

file.close() - closes file
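A minimal sketch of the open modes on a temporary file. It uses with blocks, which close the file automatically, instead of calling close() by hand; the file name is made up.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'notes.txt')

# 'w' creates the file (or empties an existing one) before writing.
with open(path, 'w') as notes:
    notes.write('first line\n')

# 'a' keeps the existing content and adds to the end.
with open(path, 'a') as notes:
    notes.write('second line\n')

with open(path) as notes:          # default mode is 'r'
    whole_text = notes.read()
with open(path) as notes:
    lines = notes.readlines()

# Reading a file that does not exist raises an error instead of creating it.
try:
    open(path + '.missing')
    missing_error = None
except FileNotFoundError as error:
    missing_error = error
```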

Work with binary shelf files (can be used as dictionaries etc.):

import shelve
testFile = shelve.open('filename')
testValues = ['email', 'address', 'phone']
testFile['data'] = testValues
testFile.close()

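testFile = shelve.open('filename') - reopen the shelf before reading, since it was closed above

testFile['data'] - returns the stored values

list(testFile.keys()) - returns list of keys, in our case it will be just ['data']

list(testFile.values()) - returns list of values ie [['email', 'address', 'phone']]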
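The whole shelve round trip as one runnable sketch. The shelf is stored in a temporary directory, and with blocks are used so the shelf is closed automatically.

```python
import os
import shelve
import tempfile

shelf_path = os.path.join(tempfile.mkdtemp(), 'filename')

# Store a list under the key 'data'; the shelf is saved when the block ends.
with shelve.open(shelf_path) as test_file:
    test_file['data'] = ['email', 'address', 'phone']

# Reopen the shelf and read everything back like a dictionary.
with shelve.open(shelf_path) as test_file:
    stored = test_file['data']
    keys = list(test_file.keys())
    values = list(test_file.values())

print(stored, keys, values)
```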

import pprint
pprint.pformat(list) - returns the list / dictionary formatted as a string, which makes it easy to save to a file
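A small sketch of what pformat() gives back. The contacts data is invented, and reading the string back with ast.literal_eval() is my own addition to show that the text is valid Python syntax for simple data.

```python
import ast
import pprint

# Hypothetical sample data.
contacts = [
    {'name': 'Alice', 'phone': '555-0100'},
    {'name': 'Bob', 'phone': '555-0199'},
]

# pformat() returns the formatted text instead of printing it.
as_text = pprint.pformat(contacts)

# For plain lists, dicts, strings and numbers the text can be parsed back.
restored = ast.literal_eval(as_text)

print(as_text)
```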

And to finish, a short reminder about shelf files and plain text files: the first are used mostly for complex objects (ie lists and dictionaries) while plain text files can be used for storing simple data (ie text, integers etc)

Sunday, March 26, 2017

Regex rules for Python...

Something to keep in mind about Regex.

? - matches zero or one of the preceding group
* - matches zero or more of the preceding group
+ - matches one or more of the preceding group
{n} - matches exactly n of the preceding group
{n,} - matches n or more of the preceding group
{,m} - matches 0 to m of the preceding group
{n,m} - matches at least n and at most m of the preceding group
{n,m}? or *? or +? - performs a nongreedy match of the preceding group
^spam - means the string must begin with spam
spam$ - means the string must end with spam
. - matches any character, except newline characters
\d, \w and \s - match a digit, word, or space character
\D, \W and \S - match anything except a digit, word or space character
[abc] - matches any character between the brackets
[^abc] - matches any character that isn't between brackets
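A few of the rules above in action. This is only a sketch with made-up test strings; each comment shows what that call returns.

```python
import re

# ? makes the preceding 'a' optional.
optional = re.findall(r'spa?m', 'spm spam spaam')        # ['spm', 'spam']

# {n,m} takes between n and m repetitions, as many as possible.
digits = re.findall(r'\d{2,4}', '7 42 2017 123456')      # ['42', '2017', '1234', '56']

# Greedy vs nongreedy: +? stops at the first possible end.
greedy = re.search(r'<.+>', '<a><b>').group()            # '<a><b>'
nongreedy = re.search(r'<.+?>', '<a><b>').group()        # '<a>'

# ^ and $ anchor the pattern to the start and end of the string.
starts = re.search(r'^spam', 'spam and eggs') is not None    # True
ends = re.search(r'spam$', 'spam and eggs') is not None      # False

# [^...] matches any character that is not listed.
not_vowels = re.findall(r'[^aeiou ]', 'spam eggs')       # ['s', 'p', 'm', 'g', 'g', 's']
```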

To use these rules, import the re package; after that the compile method can be invoked:

import re
regex = re.compile(r'spam')

Some useful methods:

match = regex.search('abc') - returns a match object for the first occurrence of the pattern, or None if there is no match
match.group(1) - returns first group in case of multiple of those ie re.compile(r'(\d\d)-(\d\d\d)')
match.group() - returns the entire matched text
match.groups() - returns tuple with the text of all groups
regex.findall('abc') - returns every match in the searched string as a list of strings (a list of tuples if the pattern has more than one group)
regex.sub() - takes two arguments: first the replacement string and second the string where to replace (in case of match) ie regex.sub(r'xyz', 'abc is the most popular shortcut')
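The methods above on one pattern with two groups, as a short sketch. The sample text and the XX-XXX replacement are made up.

```python
import re

regex = re.compile(r'(\d\d)-(\d\d\d)')
text = 'codes: 12-345 and 67-890'

# search() returns a match object for the first hit (or None).
match = regex.search(text)
whole = match.group()        # '12-345'
first = match.group(1)       # '12'
groups = match.groups()      # ('12', '345')

# With more than one group, findall() returns a list of tuples.
all_matches = regex.findall(text)    # [('12', '345'), ('67', '890')]

# sub() takes the replacement first and the string to search second.
masked = regex.sub('XX-XXX', text)   # 'codes: XX-XXX and XX-XXX'
```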

List is not complete and I will try to add some more examples later...

Math calculation with Bash in OSX

There are quite a few ways to do calculations with Bash in the OSX terminal, but for reference I'd like to mention just several of them which can be used as a baseline.

echo "2 * 2" | bc

echo "scale=3; 1000/52" | bc - will output the result with 3 digits after the decimal point

bc <<< "100 / 3"

bc -l <<< "sqrt(5)" (other built-ins: scale(3.3) - number of digits after the decimal point, length(100) - number of significant digits)

awk "BEGIN {print 33 * 33}"