Skip to content

Files

Latest commit

1dde4c4 · Jan 16, 2021

History

History
153 lines (88 loc) · 5.13 KB

401read11.md

File metadata and controls

153 lines (88 loc) · 5.13 KB

What is Jupyter?

A web-based user interface.

JupyterLab enables you to work with documents and activities such as Jupyter notebooks, text editors, terminals, and custom components in a flexible, integrated, and extensible manner.

It provides a code console to run code. Allows code in any text to be run in any Jupyter kernel. Has a bunch of libraries to access.

Command mode is for navigation.

Hot keys: Add a cell above with a, below with b, copy a cell with c, cut with x, paste with v, delete with pressing d twice, undo with z, redo with shift z.

Change the format of a cell to code with y and markdown with m.

Edit mode allows you to edit the contents of the cell. You must press shift enter to save the cell.

When you are coding and press shift enter it will run the code inside the cell. Since this is within the same Kernel all the code will be saved for instance:

In cell 1:

x = 4

In cell 2:

y = x + 5
print(output: y)

output: 9

Cell 2 will reference x's earlier assignment from the previous cell.

To reset the project press 0 twice.

JupyterLab was created in Feb 2018 and continues to update.

NumPy

NumPy is a commonly used Python data analysis package.

NumPy was originally developed in the mid 2000s. This longevity means that almost every data analysis or machine learning package for Python leverages NumPy in some way.

Easy way to put csv stuff into a list that can be used with NumPy:

import csv
with open('myfile.csv', 'r') as f:
    info = list(csv.reader(f, delimiter=';'))

print(info[:3])

Import csv, open the file with the reader object. Next by giving the keyword delimiter=';' it will split the info at the semicolon (This can be replaced with comma). Place that in a list type and assign it.

the print statement will print the first 3 rows.

Now you can manipulate and convert the data so it will work with NumPy. Keep in mind you might need to convert strings to floats or ints.

2-Dimensional Arrys

With NumPy, we work with multidimensional arrays. A 2-dimensional array is also known as a matrix

A matrix has rows and columns. By specifying a row number and a column number, we’re able to extract an element from a matrix.

Creating A NumPy Array

Start with importing numpy as np

You must use the numpy.array function. If we pass in a list of lists, it will automatically create a NumPy array with the same number of rows and columns.

(using the code from above)

import numpy as np

info = np.array(info[1:], dtype=np.float)

If you print/display the variable info you'll see a NumPy array.

You can also use the shape property to see the number of rows and columns.

There are more than one way to create NumPy array.

You can also create an array where each element is a random number using numpy.random.rand. Here’s an example:

np.random.rand(3,4)

array([[ 0.2247223 , 0.92240549, 0.14541893, 0.61731257],
[ 0.00154957, 0.82342197, 0.74044906, 0.11466845],
[ 0.6152478 , 0.14433138, 0.13009583, 0.22981301]])

Notice it made 3 arrays, and 4 floats inside the arrays.

NumPy Data Types

To see what kind of data type the NumPy array is:

info.dtype

  • float — numeric floating point data.
  • int — integer data.
  • string — character data.
  • object — Python objects.

Converting Data Types

You can use the numpy.ndarray.astype method to convert an array to a different type. The method will actually copy the array, and return a new array with the specified data type.

info.astype(int)

Math with NumPy

If you do any of the basic mathematical operations (/, *, -, +, ^) with an array and a value, it will apply the operation to each of the elements in the array.

Multiple Array Math

All of the common operations (/, *, -, +, ^) will work between arrays. It’s also possible to do mathematical operations between arrays. This will apply the operation to pairs of elements.

Broadcasting

Unless the arrays that you’re operating on are the exact same size, it’s not possible to do elementwise operations. In cases like this, NumPy performs broadcasting to try to match up elements.

The last dimension of each array is compared. If the dimension lengths are equal, or one of the dimensions is of length 1, then we keep going. If the dimension lengths aren’t equal, and none of the dimensions have length 1, then there’s an error.

NumPy Array Methods

In addition to the common mathematical operations, NumPy also has several methods that you can use for more complex calculations on arrays.

info[:,11].sum()

You will get the sum of all that column.

  • numpy.ndarray.mean — finds the mean of an array.
  • numpy.ndarray.std — finds the standard deviation of an array.
  • numpy.ndarray.min — finds the minimum value in an array.
  • numpy.ndarray.max — finds the maximum value in an array.

NumPy Array Comparisons

NumPy makes it possible to test to see if rows match certain values using mathematical comparison operations like <, >, >=, <=, and ==

Subsetting

One of the powerful things we can do with a Boolean array and a NumPy array is select only certain rows or columns in the NumPy array.

Link to NumPy cheat sheet