Python: Explorative Data Analysis, Replacing Nan. Help! - Programming - Nairaland

Python: Explorative Data Analysis, Replacing Nan. Help! by DasaintHope(m): 12:42pm On Nov 20, 2022
Good day. I am new to data analysis and Python. I want to replace all the NaN values in the Titanic dataset, specifically in the age column.
I created a function:

def impute_age(cols):
age = cols[0]
pclass == cols[1]

if pd.isnull(age):

if pclass == 1
return 37

elif pclass == 2:
return 29

else:
return 24

else:
return age.


The output is: SyntaxError: invalid syntax.
It indicates that the last else is the issue. The tutor in the video I am following did exactly this and had no error.
I would appreciate your input.
Re: Python: Explorative Data Analysis, Replacing Nan. Help! by Samuell2019(m): 4:58pm On Nov 20, 2022
The second to the last else should be elif instead.
Re: Python: Explorative Data Analysis, Replacing Nan. Help! by DasaintHope(m): 6:25pm On Nov 20, 2022
Thanks man.
Modified: @Samuell2019 the problem still persists. It says the elif is invalid syntax.
Re: Python: Explorative Data Analysis, Replacing Nan. Help! by semmyk(m): 6:38pm On Nov 20, 2022
Glancing through, it seems you might have multiple errors down the line.
First step: kindly check your indentation.

Something like the below. I'll try to run it with random values and give feedback.
Updated: I see NL distorts the indents, so I'll do a screenshot as well.

def impute_age(cols):
    age = cols[0]     # assign first column to age
    pclass = cols[1]  # assignment (=); why the comparison operator in pclass == cols[1]?

    if pd.isnull(age):
        if pclass == 1:  # integer
            return 37
        elif pclass == 2:  # integer
            return 29
        else:
            return 24
    else:
        return age

impute_age(cols_test)


Re: Python: Explorative Data Analysis, Replacing Nan. Help! by DasaintHope(m): 7:02pm On Nov 20, 2022
Thanks boss! If I understand correctly, the issue is the == I put instead of =, right?
Modified:
Boss, the problem still persists. The issue is with the last else statement.
Re: Python: Explorative Data Analysis, Replacing Nan. Help! by semmyk(m): 9:54pm On Nov 20, 2022
OK. Noted, sir.
If you could provide the exact error message, that would help. Typically, that's the easiest way to assist.
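
One way to get the exact message and line number is to let Python compile the function text and catch the SyntaxError. The sketch below is only an illustration: the indentation inside `source` is a guess at what the original looked like (both else clauses on the same level), since the forum stripped it, and `syntax_error_report` is a hypothetical helper name.

```python
# Hypothetical reconstruction: both else clauses share one indentation level.
source = '''\
def impute_age(cols):
    age = cols[0]
    pclass = cols[1]
    if pd.isnull(age):
        if pclass == 1:
            return 37
        elif pclass == 2:
            return 29
        else:
            return 24
        else:
            return age
'''


def syntax_error_report(code):
    """Return (line number, message) for a SyntaxError, or None if the code compiles."""
    try:
        # compile() only parses the text; nothing is executed, so pd need not exist.
        compile(code, "<snippet>", "exec")
    except SyntaxError as err:
        return err.lineno, err.msg
    return None


report = syntax_error_report(source)
print(report)  # the line number points at the second else
```

Posting that line number together with the message usually narrows things down faster than the code alone.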

While at it, kindly take note that code might not throw an error but might still have a functional error: it runs, but the outcome is not what was intended.
In your case, the positioning of elif and else goes a long way in determining what the code does.
See, for instance, the snapshot below. NB: this is based on assumptions and generated data (which will be different from your trainer's)!
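
To illustrate the point about elif/else positioning in text form, here is a minimal sketch with made-up inputs. Both functions are valid Python, but in the second one the final else is attached to the outer if, so it runs and quietly gives different results. `math.isnan` stands in for `pd.isnull` to keep the sketch dependency-free, and both function names are hypothetical.

```python
import math


def impute_nested(age, pclass):
    """Intended logic: only a missing age is replaced."""
    if math.isnan(age):
        if pclass == 1:
            return 37
        elif pclass == 2:
            return 29
        else:
            return 24
    else:
        return age


def impute_misplaced(age, pclass):
    """Runs without error, but the else now pairs with the outer if."""
    if math.isnan(age):
        if pclass == 1:
            return 37
        elif pclass == 2:
            return 29
    else:
        return 24  # every known age is overwritten with 24
    # a missing age with any other pclass falls through and returns None


nan = float("nan")
for age, pclass in [(nan, 1), (nan, 3), (40.0, 1)]:
    print(age, pclass, impute_nested(age, pclass), impute_misplaced(age, pclass))
```

The only difference between the two is the indentation level of the last else and what it returns, which is why no error is raised.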

By the way, I'm not sure why the trainer used the example (s)he used.
I'm not so sure it's pythonic enough.

Re: Python: Explorative Data Analysis, Replacing Nan. Help! by semmyk(m): 10:05pm On Nov 20, 2022
@DasaintHope
[update] I just checked online and realised that what you're working on (the Titanic dataset) apparently has many worked examples.

These two might be of assistance.
PS: going by them, the assumptions in my worked example seem to be in line with how the function is meant to be used.
NB: I could even have opted for np.where()

https://python-forum.io/thread-2758.html
https://www.kaggle.com/code/frtgnn/a-simple-guide-to-titanic-survival-classifier/notebook

All the best. Enjoy Python coding and ML the pythonic way.
Re: Python: Explorative Data Analysis, Replacing Nan. Help! by DasaintHope(m): 10:30pm On Nov 20, 2022
Thanks bro, I've got it. I shouldn't have given both else statements the same indentation.
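
For anyone reading later, a minimal sketch of the function with the inner and outer else on different indentation levels, applied row by row. The small DataFrame is made up for illustration (it is not the Titanic data), and the columns are read by name rather than by position, which is an assumption about how the real frame is labelled.

```python
import numpy as np
import pandas as pd


def impute_age(cols):
    age = cols["Age"]
    pclass = cols["Pclass"]

    if pd.isnull(age):
        # inner chain: choose a fill value by passenger class
        if pclass == 1:
            return 37
        elif pclass == 2:
            return 29
        else:
            return 24
    else:
        # outer else, one level out: keep an age that is already known
        return age


# Made-up sample rows, only for demonstration
train = pd.DataFrame({
    "Age": [22.0, np.nan, np.nan, np.nan, 35.0],
    "Pclass": [3, 1, 2, 3, 1],
})

# axis=1 passes each row to impute_age as a Series
train["Age"] = train[["Age", "Pclass"]].apply(impute_age, axis=1)
print(train)
```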
Re: Python: Explorative Data Analysis, Replacing Nan. Help! by semmyk(m): 8:13am On Nov 21, 2022
semmyk:
... ...
NB: I could have even opted for np.where()
... ...
So, I gave np.where a shot just now. The trick seems to be a nested np.where (NB: two 'cascaded' np.where calls gave a weird output).

NB: NL doesn't preserve indentations.
NB: I used mybinder online from my tab.

## np.where | https://numpy.org/doc/stable/reference/generated/numpy.where.html
## import libraries
import pandas as pd
import numpy as np

## create data
A = np.arange(20, 30, 3)  # array of 4 ages; shorter than B, so the DataFrame pads it with NaN

B1 = np.arange(3, 6)  # pclass values 3, 4, 5
B_add = np.random.default_rng().integers(1, 3, size=3)  # 3 random ints, each 1 or 2
B = np.append(B1, B_add)  # 6 pclass values in total
print(f'A: {A} | B: {B}')

## create dataframe, transposed, and check the created data
cols_test = pd.DataFrame([A, B]).T
cols_test[1] = cols_test[1].astype(int)  # ensure the pclass column is int
cols_test

## define function to check NaN/null age and assign age
def impute_age(cols):
    age = cols.loc[:, 0]     # first column: age Series
    pclass = cols.loc[:, 1]  # second column: pclass Series

    ## nested np.where: 37 / 29 / 24 as in the if/elif/else version;
    ## the innermost call covers a null age with any other pclass,
    ## so no NaN is left when casting to int
    age_null = np.where(age.isnull() & (pclass == 1), 37,
                        np.where(age.isnull() & (pclass == 2), 29,
                                 np.where(age.isnull(), 24, age))).astype(int)

    print(f'age col[0]: \n{age_null}')
    return age_null

impute_age(cols_test)
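
As an aside, the same fill can be sketched without any nesting by mapping each class to its fill value and handing the result to fillna. This is a different technique from the np.where approach, not a drop-in copy of it; the column names and sample rows are made up, and 37/29/24 are the values from the original function.

```python
import numpy as np
import pandas as pd

# Made-up sample rows, only for demonstration
df = pd.DataFrame({
    "Age": [22.0, np.nan, np.nan, np.nan, 35.0],
    "Pclass": [3, 1, 2, 3, 1],
})

# Per-row fill value: 37 for class 1, 29 for class 2, 24 for anything else.
# map() leaves unmapped classes as NaN, which the trailing fillna(24) covers.
fill_by_class = df["Pclass"].map({1: 37, 2: 29}).fillna(24)

# fillna with a Series aligns on the index and only touches missing ages
df["Age"] = df["Age"].fillna(fill_by_class)
print(df)
```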
