vault backup: 2026-01-27 15:17:32

2026-01-27 15:17:32 +01:00
parent e51f022340
commit 9ca1a90014
24 changed files with 843 additions and 76 deletions

@@ -10,7 +10,7 @@ Missing data:
- a22ad77be0d478b5fe34d1167d4dbb3a
- a96f7c881d27b739c38b81da2058a2fb
- c6679ce22dfd5646f81d40a8dbb0236b
Examples:
>[!danger]
@@ -56,40 +56,25 @@ Ideas:
```python
-# Plot
-fig, ax = plt.subplots(figsize=(8, 5))
-ax.set_title(
-    "Distribution of avg. gross monthly earnings excl. special payments (by gender)"
-)
-sb.kdeplot(
-    female, label="Female",
-    color="#fe640b", fill=True,
-    alpha=0.5, ax=ax
-)
-sb.kdeplot(
-    male, label="Male",
-    color="#40a02b", fill=True,
-    alpha=0.5, ax=ax
-)
-ax.legend()
+# Binary variable: vegan = 1, non-vegan = 0
+df['ist_vegan'] = (df['Ernährung'] == 'Vegan').astype(int)
-text = f'''tstat: {t_stat:.2f}
-pvalue: {p_value:.3f}
-Pay Gap: {unbereinigter_gpg:.2f}%
-Abs. Diff: {bruttoeinkommen_diff:.2f}€'''
+# Filter: at least 2 Instagram accounts AND at least one of them private
+h2_data = df[
+    (df['Instagram_Anzahl'] >= 2) & (df['Instagram_Privat'] == 'Ja')
+]
-ax.text(0.95, 0.75, text,
-    transform=ax.transAxes,
-    fontsize=10,
-    verticalalignment='top',
-    horizontalalignment='right',
-    bbox=dict(
-        facecolor='white',
-        edgecolor='black',
-        pad=5, alpha=0.8
-    )
-)
-plt.show()
+# Groups
+vegan_mit = h2_data['ist_vegan']
+vegan_ohne = df[
+    (df['Instagram_Anzahl'] < 2) | (df['Instagram_Privat'] != 'Ja')
+]['ist_vegan']
+# Independent t-test
+t_stat, p_value = ttest_ind(vegan_mit, vegan_ohne)
+vegan_pct_mit = vegan_mit.mean() * 100
+vegan_pct_ohne = vegan_ohne.mean() * 100
```
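The vegan-share comparison in the block above runs `ttest_ind` on a 0/1 indicator. For a binary outcome, a two-proportion z-test is a closely related check. The following is a minimal sketch using only the standard library, not the method used in the notes. The function name `two_proportion_ztest` and all counts are hypothetical and exist only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for H0: both groups share one proportion.

    Hypothetical helper for illustration; it is not part of the original notes.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one observation")
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of equal shares
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero (all or no successes)")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up counts: vegans among respondents with >= 2 accounts and a private one
# (group "mit") versus everyone else (group "ohne").
vegan_mit_count, n_mit = 12, 40
vegan_ohne_count, n_ohne = 30, 200

z, p_value = two_proportion_ztest(vegan_mit_count, n_mit, vegan_ohne_count, n_ohne)
print(f"share mit:  {vegan_mit_count / n_mit * 100:.1f}%")
print(f"share ohne: {vegan_ohne_count / n_ohne * 100:.1f}%")
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

With large groups the z-test and the t-test on the 0/1 indicator tend to give similar results. With small groups or very rare outcomes, an exact test may be the safer choice.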
@@ -98,5 +83,11 @@ $$\text{Birth Rate} = \frac{B}{P} * 1000$$
# Projects
- [Social Media Addiction](https://www.kaggle.com/datasets/adilshamim8/social-media-addiction-vs-relationships)
- [Movies & Shows](https://www.kaggle.com/datasets/shivamb/amazon-prime-movies-and-tv-shows)
- 8
Write to Annemike Rörig