Python Vocab 101

Charles Vanderbilt
6 min read · Nov 3, 2020

Learning a new programming language can be daunting and confusing, because you are essentially learning a new way of telling a computation machine, a.k.a. the computer, to do your bidding.

With that in mind, what better way to learn a new language than to get to know its vocabulary (i.e., its functions and methods)?

We will focus on basic vocabulary that is often used for text processing, but it is quite general, so I am pretty sure it will be useful for other things as well.

Importing and installing a package

Python is widely popular for its vast number of packages, which will do pretty much everything you need done. To use a package in your environment, simply import it. If you do not have that particular package installed, you can install it with pip install or conda install, depending on your environment.

%pip install spacy
%pip install pandas

In some environments, such as Jupyter notebooks, you need to add “%” or “!” before the pip command. In a regular terminal, leave the prefix off.
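Before installing, it can help to check whether a package is already available. The sketch below is one possible way to do that with the standard library's importlib module. The helper name is_installed is hypothetical (it is not part of pip, pandas, or spaCy), and the check only covers top-level package names.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found in this environment."""
    # find_spec returns None when the package cannot be located
    return importlib.util.find_spec(package_name) is not None


# 'json' ships with Python, so it should always be found
print(is_installed("json"))

# A made-up name that should not exist in any environment
print(is_installed("surely_not_a_real_package_123"))
```

If the check comes back False, that is the cue to run pip install or conda install for that package.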

import pandas as pd
import matplotlib.pyplot as plt
import spacy

Putting “as” after the package name gives the package a shorter alias that you can use when calling it.

For example, instead of using pandas.read_csv(), you can do pd.read_csv().

You can also do a selective import by using the format from package_name import name, where the name can be a module, class, or function inside the package.

from spacy import tokenizer
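The same two import styles work for any package. As a minimal sketch that runs without installing anything, the example below uses the standard library modules statistics and collections in place of pandas and spaCy; the variable names are only illustrative.

```python
# Alias import: 'st' is just a shorter name for the statistics module
import statistics as st

# Selective import: bring only Counter into the current namespace
from collections import Counter

words = ["sky", "fly", "sky"]
word_counts = Counter(words)        # counts how often each word appears
average = st.mean([1, 2, 3, 4])     # called through the alias

print(word_counts["sky"])  # 2
print(average)             # 2.5
```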

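Import names cannot contain hyphens. If a package has a hyphen in its install name, the import name usually uses an underscore instead. Some packages use a different import name altogether, so check the package's documentation when in doubt.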

%pip install -U sentence-transformers
import sentence_transformers

Popular functions and methods

Python offers built-in functions and methods that are extremely useful and will make your life easier. Some of them are below.

A function has the following syntax — function(argument)

A method has the following syntax — argument.method()
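To make the difference concrete, here is a small sketch that applies one function and one method to the same string. Both len() and upper() are covered in more detail further down; the variable names are only illustrative.

```python
word = "python"

# Function syntax: the object goes inside the parentheses
length = len(word)

# Method syntax: the method is called on the object with a dot
shouted = word.upper()

print(length)   # 6
print(shouted)  # PYTHON
```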

print() will print out the argument given to the function

x = 'Hello World'
print(x)
Hello World

Using print() we can also quickly find out whether a certain value exists in an object by using the in operator

print('World' in x)
True
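The in operator is not limited to strings. The sketch below, using made-up example values, shows how it behaves on a string, a list, and a dictionary. Note that for a dictionary it checks the keys, not the values.

```python
text = "Hello World"
numbers = [1, 2, 3, 4]
capitals = {"Australia": "Canberra"}

in_text = "World" in text                # substring check on a string
in_list = 5 in numbers                   # item check on a list
in_dict_key = "Australia" in capitals    # a dictionary is checked by key
in_dict_value = "Canberra" in capitals   # values are not searched

print(in_text, in_list, in_dict_key, in_dict_value)  # True False True False
```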

type() will return the type of the object such as str for string, int for integer, bool for boolean, etc.

type(x)
str

y = True
type(y)
bool

dictionary = {'Country': 'Australia', 'Capital': 'Canberra'}
type(dictionary)
dict
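type() tells you what an object is. When you only need a yes/no answer, for example before calling a string method, the built-in isinstance() is a common companion. This is a small sketch with illustrative variable names, not something the examples above depend on.

```python
x = "Hello World"
y = True

is_text = isinstance(x, str)      # True: x is a string
is_number = isinstance(x, int)    # False: x is not an integer

# bool is a subclass of int in Python, so a boolean also counts as an int
bool_is_int = isinstance(y, int)

print(is_text, is_number, bool_is_int)  # True False True
```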

len() will return the number of elements in an object

sentence = "The sun shines brighter than yesterday"
lists = [1,2,3,4]

print(len(sentence))
print(len(lists))
print(len(dictionary))
38
4
2

The upper() method returns a copy of your string in upper case. The original string is left unchanged.

sentence.upper()
'THE SUN SHINES BRIGHTER THAN YESTERDAY'

Conversely, lower() returns a copy of your string in lower case

sentence.lower()
'the sun shines brighter than yesterday'

strip() will remove leading and trailing whitespace from your string

a = "   I believe I can fly.   "
a.strip()
'I believe I can fly.'
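strip() has two siblings, lstrip() and rstrip(), that clean only one side, and all three accept an optional set of characters to remove. A short sketch using the same string:

```python
a = "   I believe I can fly.   "

left_clean = a.lstrip()      # removes whitespace on the left only
right_clean = a.rstrip()     # removes whitespace on the right only
no_period = a.strip(" .")    # removes spaces and periods from both ends

print(repr(left_clean))   # 'I believe I can fly.   '
print(repr(right_clean))  # '   I believe I can fly.'
print(repr(no_period))    # 'I believe I can fly'
```

None of these calls change a itself. Each one returns a new string.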

split() will split your string into a list. By default it splits on whitespace.

a.split()
['I', 'believe', 'I', 'can', 'fly.']
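split() also accepts a separator, which is handy for comma-separated text, and join() does the reverse by gluing a list back into one string. The sample line below is made up for illustration.

```python
csv_line = "Australia,Canberra,English"

# Split on commas instead of whitespace
fields = csv_line.split(",")

# join() is called on the separator string and takes the list as its argument
rejoined = " | ".join(fields)

print(fields)    # ['Australia', 'Canberra', 'English']
print(rejoined)  # Australia | Canberra | English
```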

append() will add an item at the end of the list.

a_list = a.split()
a_list.append('sky')

print(a_list)
['I', 'believe', 'I', 'can', 'fly.', 'sky']

remove() will remove the first occurrence of the value given as the argument from the list.

a_list.remove('I')
print(a_list)
['believe', 'I', 'can', 'fly.', 'sky']

You can also use pop() to remove an item at the given position in the list. If no argument is specified, it will remove the last item.

a_list.pop()
print(a_list)
['believe', 'I', 'can', 'fly.']

a_list.pop(0)
print(a_list)
['I', 'can', 'fly.']
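Two details are worth knowing here: pop() hands back the item it removed, and remove() raises a ValueError when the value is not in the list. The sketch below rebuilds the list from scratch so it can run on its own; the word 'moon' is just an example of a missing value.

```python
a_list = ["believe", "I", "can", "fly.", "sky"]

last_item = a_list.pop()     # removes and returns the last item
first_item = a_list.pop(0)   # removes and returns the item at index 0

# Guard remove() with 'in' so a missing value does not raise ValueError
if "moon" in a_list:
    a_list.remove("moon")

print(last_item)   # sky
print(first_item)  # believe
print(a_list)      # ['I', 'can', 'fly.']
```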

The index() method returns the index of the first occurrence of the argument in the list. Python starts indexing at 0.

a_list.index('can')
1

Dictionary objects are made up of key-value pairs. You can get all the keys and values by calling keys() and values() respectively.

dictionary.keys()
dict_keys(['Country', 'Capital'])

dictionary.values()
dict_values(['Australia', 'Canberra'])

Adding another key-value pair to the dictionary uses the syntax dictionary['key'] = 'value'

dictionary['Language'] = 'English'
dictionary
{'Country': 'Australia', 'Capital': 'Canberra', 'Language': 'English'}
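The same assignment syntax overwrites the value when the key already exists. To walk through keys and values together, items() is a common choice. A minimal sketch, where the value 'en' is only a placeholder to show the overwrite:

```python
dictionary = {"Country": "Australia", "Capital": "Canberra"}

dictionary["Language"] = "English"   # new key: adds a pair
dictionary["Language"] = "en"        # existing key: overwrites the value

# items() yields each key together with its value
pairs = []
for key, value in dictionary.items():
    pairs.append(f"{key}: {value}")

print(pairs)  # ['Country: Australia', 'Capital: Canberra', 'Language: en']
```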

Depending on the object type, list() returns different values. For a string, it returns each character; for a list, each item; and for a dictionary, the keys.

list(a)
[' ', ' ', ' ', 'I', ' ', 'b', 'e', 'l', 'i', 'e', 'v', 'e', ' ', 'I', ' ', 'c', 'a', 'n', ' ', 'f', 'l', 'y', '.', ' ', ' ', ' ']

list(a_list)
['I', 'can', 'fly.']

list(dictionary)
['Country', 'Capital', 'Language']

Accessing values

Depending on the object type, you access the value of an object differently.

# Build a list
a = "I believe I can fly. I believe I can touch the sky"
a_list = a.split()
print(a_list, end="\n\n====\n\n")

# Build a dictionary
dictionary = {'Country': 'Australia' , 'Capital': 'Canberra', 'Language': 'English', 'Currency': 'Dollar'}
print(dictionary, end="\n\n====\n\n")

# Build a DataFrame

# Define lists
country = ['Australia','New Zealand','United Kingdom','Canada','Japan','South Korea','Singapore','India']
dollar = [True, True, False, True, False, False, True, False]

# Create a dictionary
new_dict = {'Country':country, 'Dollar Currency': dollar}

# Convert to DataFrame
df = pd.DataFrame(new_dict)

print(df)
['I', 'believe', 'I', 'can', 'fly.', 'I', 'believe', 'I', 'can', 'touch', 'the', 'sky']

====

{'Country': 'Australia', 'Capital': 'Canberra', 'Language': 'English', 'Currency': 'Dollar'}

====

          Country  Dollar Currency
0       Australia             True
1     New Zealand             True
2  United Kingdom            False
3          Canada             True
4           Japan            False
5     South Korea            False
6       Singapore             True
7           India            False

List

For a list object, you simply use the syntax list[i], where i is an index number. Remember that Python starts indexing at 0.

a_list[0]
'I'

a_list[3]
'can'

Supplying negative values will return items from the end of the list.

a_list[-1]
'sky'

a_list[-4:-2]
['can', 'touch']

When you supply a range, the last index is not counted. So if you say 2:5, it will return the item at index 2, 3, and 4.

a_list[2:5]
['I', 'can', 'fly.']
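Either side of the range can be left out, and a third number sets the step size. The sketch below rebuilds the same list so it can run on its own; the variable names are only illustrative.

```python
a_list = "I believe I can fly. I believe I can touch the sky".split()

first_three = a_list[:3]     # start omitted: begin at index 0
last_three = a_list[-3:]     # end omitted: run to the end of the list
every_other = a_list[::2]    # step of 2: indices 0, 2, 4, ...

print(first_three)  # ['I', 'believe', 'I']
print(last_three)   # ['touch', 'the', 'sky']
print(every_other)  # ['I', 'I', 'fly.', 'believe', 'can', 'the']
```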

Dictionary

For a dictionary, you specify a key to return the corresponding value. Keys are case sensitive.

print(dictionary['Country'])
print(dictionary['Currency'])
Australia
Dollar
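Because keys are case sensitive, asking for a key that does not exist raises a KeyError. The get() method is a gentler alternative that returns None, or a default of your choosing, instead. A sketch under the assumption that 'Population' is not a key in the dictionary:

```python
dictionary = {"Country": "Australia", "Capital": "Canberra",
              "Language": "English", "Currency": "Dollar"}

# 'country' with a lower-case c is a different key, so the lookup fails
lookup_failed = False
try:
    dictionary["country"]
except KeyError:
    lookup_failed = True

currency = dictionary.get("Currency")                 # existing key
population = dictionary.get("Population", "unknown")  # missing key, default used

print(lookup_failed)  # True
print(currency)       # Dollar
print(population)     # unknown
```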

Data Frame

With square brackets, a Data Frame requires a range (a slice) to return rows. Supplying a single index number will not work, because pandas will look for a column with that name instead. The slice syntax is the same as for a list object.

df[0:3]
          Country  Dollar Currency
0       Australia             True
1     New Zealand             True
2  United Kingdom            False

If you want to use a single index number, you will need to use the iloc indexer, which takes square brackets rather than parentheses

df.iloc[[1]]
       Country  Dollar Currency
1  New Zealand             True

list() will return the column names of the data frame

list(df)
['Country', 'Dollar Currency']

The .shape attribute of the data frame returns a pair of numbers. The first value is the number of rows (entries) in the data frame, and the second value is the number of columns.

Alternatively, you can use .info() to get a more comprehensive summary of the data frame.

df.shape
(8, 2)

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 2 columns):
Country 8 non-null object
Dollar Currency 8 non-null bool
dtypes: bool(1), object(1)
memory usage: 152.0+ bytes

Another way to look at the data frame is the .T attribute, which shows the data frame transposed, with rows and columns swapped.

df.T
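                         0            1               2       3      4            5          6      7
Country          Australia  New Zealand  United Kingdom  Canada  Japan  South Korea  Singapore  India
Dollar Currency       True         True           False    True  False        False       True  False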

Conclusion

One of the things I have found important in learning a new programming language, or any language in general, is getting to know its key words and word types or parts of speech, such as nouns, verbs, and pronouns. In the case of Python, you want to know the popular methods and functions to get you started, as well as understand basic objects such as lists, dictionaries, and data frames.

As with any learning process, practice will help you get used to these key concepts and discover more vocabulary as you go along.

Thanks for reading!
