Python Interview Questions: PandasAI, AWS Data Wrangler, Siuba, PyGraphistry & pandas-on-Spark! 🚀

1️⃣ PandasAI for Conversational DataFrames

PandasAI adds generative AI to pandas, letting you query, transform, and visualize a DataFrame using plain English prompts.

Example:

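from pandasai import PandasAI
from pandasai.llm.openai import OpenAI
import pandas as pd

# Illustrative data (assumed); legacy 0.x PandasAI API, newer releases differ
df = pd.DataFrame({"product": ["A", "B", "C"], "sales": [120, 340, 210]})
ai = PandasAI(OpenAI(api_token="YOUR_API_KEY"))
# Ask for the top-selling product
result = ai.run(df, prompt="Which product has the highest sales?")
print(result)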
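
To make the idea concrete, here is a minimal, library-free sketch of the prompt-to-code loop that conversational DataFrame tools are generally built around: a model turns the question into a snippet, and the snippet is run against the data. `fake_llm`, `ask`, and the sales records are hypothetical stand-ins; this is not PandasAI's actual implementation, and a real tool would call a hosted model and sandbox the generated code.

```python
# Hypothetical sales records standing in for a DataFrame.
SALES = [
    {"product": "A", "sales": 120},
    {"product": "B", "sales": 340},
    {"product": "C", "sales": 210},
]


def fake_llm(prompt: str) -> str:
    """Stand-in for a language model: maps one known question to a snippet."""
    if "top-selling" in prompt.lower():
        return "max(rows, key=lambda r: r['sales'])['product']"
    raise ValueError("this toy model only understands one question")


def ask(rows, prompt: str):
    """Generate an expression for the prompt and evaluate it against rows."""
    expression = fake_llm(prompt)
    # eval is used only because the snippet comes from our own stub above;
    # never evaluate untrusted model output without sandboxing.
    return eval(expression, {"__builtins__": {"max": max}}, {"rows": rows})


answer = ask(SALES, "What is the top-selling product?")
print(answer)
```

The point of the sketch is the division of labor: the prompt never touches the data directly, only the generated snippet does.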

2️⃣ AWS Data Wrangler for AWS-Native ETL

AWS Data Wrangler (awswrangler, since renamed AWS SDK for pandas) extends pandas with functions to read from and write to AWS services such as Athena, Glue, S3, and Redshift using familiar DataFrame commands.

Example:

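import awswrangler as wr
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "value": [10, 20]})
# Write to S3 as a partitioned Parquet dataset
# (bucket, database, and table names are placeholders; the Glue database must already exist)
wr.s3.to_parquet(df=df, path="s3://my-bucket/my-dataset/", dataset=True,
                 partition_cols=["id"], database="my_database", table="my_table")
# Query with Athena into a DataFrame
result = wr.athena.read_sql_query("SELECT * FROM my_table", database="my_database")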
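
A partitioned dataset is typically laid out as one `column=value/` folder per distinct partition value (Hive-style partitioning). The sketch below groups rows by a partition column and shows the prefix each group would land under. It uses only the standard library and does not call AWS; `partition_prefixes`, the bucket name, and the choice of `id` as the partition column are assumptions for illustration.

```python
from collections import defaultdict


def partition_prefixes(rows, base_path, partition_col):
    """Group rows by partition_col and map each Hive-style prefix to its rows."""
    groups = defaultdict(list)
    for row in rows:
        prefix = f"{base_path.rstrip('/')}/{partition_col}={row[partition_col]}/"
        # The partition column lives in the path, so drop it from the stored row.
        groups[prefix].append({k: v for k, v in row.items() if k != partition_col})
    return dict(groups)


rows = [{"id": 1, "value": 10}, {"id": 2, "value": 20}]
layout = partition_prefixes(rows, "s3://my-bucket/my-dataset/", "id")
for prefix, members in layout.items():
    print(prefix, members)
```

Because the partition value is encoded in the path, a query engine can skip whole folders when a filter names that column.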

3️⃣ Siuba for Tidy-Style Data Wrangling

Siuba is a port of R’s dplyr to Python, offering verbs like select(), filter(), mutate(), and a pipe operator (>>) to streamline scrappy analyses on pandas or SQL backends.

Example:

from siuba import _, select, filter, head
from siuba.data import mtcars

# Filter and select columns with dplyr syntax
(mtcars
 >> filter(_.mpg > 20)
 >> select(_.mpg, _.hp)
 >> head(5)
)
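
The `>>` pipe is ordinary Python operator overloading. As a rough illustration of the idea (not Siuba's actual implementation), a verb can return an object whose `__rrshift__` method receives the data on the left-hand side. The `Verb`, `keep_if`, `pick`, and `first` names and the sample rows are made up for this sketch.

```python
class Verb:
    """Wraps a function so that `data >> Verb(f)` evaluates f(data)."""

    def __init__(self, func):
        self.func = func

    def __rrshift__(self, data):
        # Called for `data >> self` when type(data) has no matching __rshift__.
        return self.func(data)


def keep_if(predicate):
    return Verb(lambda rows: [r for r in rows if predicate(r)])


def pick(*columns):
    return Verb(lambda rows: [{c: r[c] for c in columns} for r in rows])


def first(n):
    return Verb(lambda rows: rows[:n])


cars = [
    {"name": "car_a", "mpg": 21.0, "hp": 110},
    {"name": "car_b", "mpg": 18.7, "hp": 175},
    {"name": "car_c", "mpg": 22.8, "hp": 93},
]

result = cars >> keep_if(lambda r: r["mpg"] > 20) >> pick("mpg", "hp") >> first(5)
print(result)
```

Each step returns plain data, so the chain reads top to bottom in the order the operations run.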

4️⃣ PyGraphistry for GPU-Accelerated Graph Visualization

PyGraphistry loads, binds, and plots large graphs with GPU acceleration, enabling interactive, web-native visual analytics for millions of nodes and edges.

Example:

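import graphistry
import pandas as pd

# Register your Graphistry API key/endpoint if needed (placeholder credentials)
# graphistry.register(api=3, username="YOUR_USERNAME", password="YOUR_PASSWORD")
edges = pd.DataFrame([{"src": 1, "dst": 2}, {"src": 2, "dst": 3}])
# Bind the source/destination columns and render the interactive plot
graphistry.edges(edges, "src", "dst").plot()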

5️⃣ pandas-on-Spark for Scalable DataFrames

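pandas-on-Spark (the pyspark.pandas module, formerly the Koalas project) exposes the pandas DataFrame API on top of Apache Spark, so pandas-style code can run distributed across a cluster.

Example:

import pyspark.pandas as ps

# Read a large CSV in parallel (placeholder path)
psdf = ps.read_csv("path/to/large.csv")
# Compute group-by and mean as you would in pandas (placeholder column names)
print(psdf.groupby("category")["value"].mean())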
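
A group-by mean parallelizes well because each partition can be reduced to per-key sums and counts that are merged afterwards. The sketch below imitates that with plain Python lists acting as partitions; it is a conceptual illustration rather than how Spark actually schedules the work, and `partial_stats`, `merge_stats`, and the sample data are hypothetical.

```python
from collections import defaultdict


def partial_stats(partition):
    """Reduce one partition of (key, value) pairs to per-key [sum, count]."""
    stats = defaultdict(lambda: [0.0, 0])
    for key, value in partition:
        stats[key][0] += value
        stats[key][1] += 1
    return stats


def merge_stats(all_stats):
    """Combine per-partition [sum, count] pairs and finish with the mean."""
    total = defaultdict(lambda: [0.0, 0])
    for stats in all_stats:
        for key, (s, n) in stats.items():
            total[key][0] += s
            total[key][1] += n
    return {key: s / n for key, (s, n) in total.items()}


partitions = [
    [("a", 10), ("b", 4)],
    [("a", 20), ("b", 6), ("a", 30)],
]
means = merge_stats(partial_stats(p) for p in partitions)
print(means)
```

Carrying sums and counts separately matters: averaging the per-partition means directly would give the wrong answer whenever partitions hold different numbers of rows per key.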