Pandas and NumPy Syntax Cheat Sheet

Norman Fung
Mar 30, 2024

Pandas

Think of everything you can do with SQL: SELECT, INSERT, UPDATE/DELETE, GROUP BY. How do you perform the same operations with pandas?

  • Basic data loading via read_json; basic operations: shift, diff, pct_change, cumprod; pd.to_datetime to convert date strings to actual datetimes
  • Time series operations: mean/std with rolling vs ewm (exponentially weighted windows, e.g. exponential moving averages)
  • Basic filters: AND (&), OR (|), negation (~), isnull and isna
  • Filtering on index columns
  • Sort by multiple fields, some ascending, others descending
  • NaN handling: fillna and replace
  • apply and lambda transformations
  • Aggregation
  • Add one row to a DataFrame
  • Modify a field of a given row, avoiding the SettingWithCopyWarning: “A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead.”
  • Column to numpy.ndarray and to a simple list
  • The basic loop: iterrows
  • Pretty-print your DataFrame: tabulate
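
As a rough sketch of the loading and time series items in the list, here is a small example. The JSON records are synthetic (made up for illustration, not the article's equity file), and the column names date and close are assumptions.

```python
import io

import pandas as pd

# Synthetic price records (hypothetical data, not the article's equity file).
raw_json = """[
  {"date": "2024-01-02", "close": 100.0},
  {"date": "2024-01-03", "close": 102.0},
  {"date": "2024-01-04", "close": 101.0},
  {"date": "2024-01-05", "close": 105.0},
  {"date": "2024-01-08", "close": 107.0}
]"""

# convert_dates=False keeps "date" as a plain string, so the explicit
# pd.to_datetime conversion below is visible.
df = pd.read_json(io.StringIO(raw_json), convert_dates=False)
df["date"] = pd.to_datetime(df["date"])
df = df.set_index("date")

# Basic operations on a price column.
df["prev_close"] = df["close"].shift(1)             # previous row's value
df["change"] = df["close"].diff()                   # close - previous close
df["ret"] = df["close"].pct_change()                # simple return
df["growth"] = (1 + df["ret"].fillna(0)).cumprod()  # cumulative growth factor

# Rolling window vs exponentially weighted window.
df["sma_3"] = df["close"].rolling(window=3).mean()
df["std_3"] = df["close"].rolling(window=3).std()
df["ema_3"] = df["close"].ewm(span=3, adjust=False).mean()
df["ewm_std_3"] = df["close"].ewm(span=3, adjust=False).std()

print(df)
```

The rolling columns are NaN until the window is full, while the ewm columns have a value from the first row, which is usually the first difference you notice between the two.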

OK, see this one short example!
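
The snippet below is a minimal sketch of the SQL-style items from the list (filter, sort, NaN handling, apply, aggregation, insert, update, conversion, loop). The trades table and its column names are hypothetical, invented only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical trades table, made up for illustration.
df = pd.DataFrame(
    {
        "ticker": ["AAPL", "MSFT", "AAPL", "GOOG", "MSFT"],
        "side": ["buy", "sell", "buy", "buy", "sell"],
        "qty": [10, 5, np.nan, 8, 12],
        "price": [190.0, 410.0, 192.0, 140.0, 405.0],
    }
)

# Filters (SQL WHERE): combine boolean masks with & (AND), | (OR), ~ (NOT).
big_buys = df[(df["side"] == "buy") & (df["price"] > 150)]
aapl_or_goog = df[(df["ticker"] == "AAPL") | (df["ticker"] == "GOOG")]
not_msft = df[~(df["ticker"] == "MSFT")]
missing_qty = df[df["qty"].isna()]  # isnull() is an alias of isna()

# Filtering on the index: move a column into the index, then select with .loc.
aapl_rows = df.set_index("ticker").loc["AAPL"]

# Sort (ORDER BY ticker ASC, price DESC).
ordered = df.sort_values(["ticker", "price"], ascending=[True, False])

# NaN handling: fillna fills gaps, replace swaps specific values.
clean = df.copy()
clean["qty"] = clean["qty"].fillna(0)
clean["side"] = clean["side"].replace({"buy": "B", "sell": "S"})

# apply + lambda: axis=1 passes one row at a time.
clean["notional"] = clean.apply(lambda row: row["qty"] * row["price"], axis=1)

# Aggregation (GROUP BY ticker).
totals = clean.groupby("ticker")["notional"].sum()

# Add one row (INSERT) by concatenating a one-row frame.
new_row = pd.DataFrame(
    [{"ticker": "TSLA", "side": "B", "qty": 3.0, "price": 250.0, "notional": 750.0}]
)
clean = pd.concat([clean, new_row], ignore_index=True)

# Modify one field of matching rows (UPDATE) with a single .loc call,
# which avoids the "copy of a slice" warning. notional is not recomputed here.
clean.loc[clean["ticker"] == "GOOG", "qty"] = 9.0

# Column to numpy.ndarray and to a plain Python list.
prices_arr = clean["price"].to_numpy()
tickers_list = clean["ticker"].tolist()

# The basic loop: iterrows yields (index, row) pairs.
for idx, row in clean.iterrows():
    print(idx, row["ticker"], row["qty"])
```

The warning quoted in the list typically shows up with chained indexing such as clean[clean["ticker"] == "GOOG"]["qty"] = 9.0, where the assignment may land on a temporary copy. Selecting the rows and the column in one .loc call sidesteps that.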

Sample equity files? I will include one of the JSON files here for the sake of convenience.

NumPy

  • Array initialization: ones/zeros/arange/linspace/randn
  • reshape
  • min/max/argmin/argmax
  • Slicing
  • Filtering
  • Array Operations
  • Sum across rows or columns
  • Statistics: min/max/mean/var/std
  • Broadcasting
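
The items above can be sketched in one short script. The array shapes and values are arbitrary choices for illustration.

```python
import numpy as np

# Array initialization.
ones = np.ones((2, 3))
zeros = np.zeros(4)
seq = np.arange(12)              # integers 0..11
grid = np.linspace(0.0, 1.0, 5)  # 5 evenly spaced points, endpoints included
noise = np.random.randn(3)       # 3 samples from a standard normal

# reshape: same data, new shape (3 rows, 4 columns).
m = seq.reshape(3, 4)

# min/max/argmin/argmax, over the whole array or along an axis.
col_max = m.max(axis=0)        # max of each column
flat_argmax = m.argmax()       # position of the max in the flattened array
row_argmin = m.argmin(axis=1)  # position of the min within each row

# Slicing.
first_two_rows = m[:2]
last_col = m[:, -1]
inner = m[1:, 1:3]

# Filtering with a boolean mask (the result is 1-D).
evens = m[m % 2 == 0]

# Array operations are element-wise; @ is matrix multiplication.
doubled = m * 2
squared = m ** 2
prod = m @ m.T

# Sum across rows or columns.
col_sums = m.sum(axis=0)  # collapse rows, one sum per column
row_sums = m.sum(axis=1)  # collapse columns, one sum per row

# Statistics.
stats = (m.min(), m.max(), m.mean(), m.var(), m.std())

# Broadcasting: the (4,) vector of column means is stretched across all 3 rows.
centered = m - m.mean(axis=0)

print(m)
print(centered)
```

A handy way to remember axis: axis=0 collapses the rows (one result per column), and axis=1 collapses the columns (one result per row).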

Now, if we have pandas, why do we need NumPy?

  • Pandas is built on top of NumPy and adds a ton of features, from data loading/export to/from CSV/JSON to SQL-style manipulation: SELECT/WHERE/GROUP BY/INSERT/UPDATE
  • Many scientific libraries use numpy.ndarray

In this simplest Keras hello-world example, you can see that Keras fit and predict take np.array as input data.
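
To show the hand-off from pandas to NumPy, here is a minimal sketch that turns DataFrame columns into the arrays you would pass to a model. Keras itself is not imported, and the feature table, its column names, and the model variable in the comment are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical feature table; the column names are made up for illustration.
df = pd.DataFrame(
    {
        "x1": [0.0, 1.0, 2.0, 3.0],
        "x2": [1.0, 0.0, 1.0, 0.0],
        "y": [1.0, 3.0, 5.0, 7.0],
    }
)

# Feature matrix with shape (n_samples, n_features) and target with shape (n_samples,).
X = df[["x1", "x2"]].to_numpy(dtype=np.float32)
y = df["y"].to_numpy(dtype=np.float32)

print(type(X), X.shape, X.dtype)
print(type(y), y.shape, y.dtype)

# With a compiled Keras model (assumed, not defined here) the next steps would be:
# model.fit(X, y)
# model.predict(X)
```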
