Scientific libraries and tools
The following libraries and tools are recommended:
Python
Data analysis: numpy, pandas
Larger-than-memory data: dask, polars
Accelerating Python loops: numba
Specialist analysis: scipy
Statistical modelling: statsmodels
Machine learning: scikit-learn, keras, tensorflow, tensorboard, pytorch, yellowbrick
Natural language processing: nltk, spacy
Geospatial data: geopandas, shapely, rasterio, rioxarray, cartopy
Visualisation: matplotlib, seaborn, altair, plotly, folium, geoplot
Dashboards: streamlit
Probabilistic programming: pymc
Storage of tabular data: Apache Parquet (via pyarrow and fastparquet), HDF5 (via hdf5 and h5py)
Web scraping: scrapy, beautifulsoup4, parsel, lxml
Web development: flask, django
UI improvements: rich, tqdm
Notebooks: jupyterlab (and nbdime for Git integration)
Testing: pytest
Documentation: sphinx, mkdocs
R
Data analysis: tidyverse (including dplyr, tidyr), data.table, sf
Visualisation: ggplot2
Statistical modelling: glm (built-in), brms
Dashboards: shiny
Database connections: odbc, dbplyr
Testing: testthat
Documentation: pkgdown
Environment management: renv
Other tools
Markdown documents and websites: quarto, jupyterbook
Online analytical processing using SQL: duckdb (with duckdb and jupysql for Python integration)
Command-line JSON processing: jq
Makefile-like pipelines: just