
Chronicle Of A Data Scientist/analyst - Programming (57) - Nairaland


Re: Chronicle Of A Data Scientist/analyst by saheedniyi22(m): 12:57pm On Jul 10, 2020
Samzeal:
Hello house, please how do I do data cleaning on my dataset?
It depends on the tools you use.
I use pandas. My main aims when cleaning are to remove missing values and to convert all categorical features to numerical features, but it depends on what you want to do with the data.
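
For anyone who wants to see what that looks like in practice, here is a minimal sketch of the two steps mentioned (handling missing values, then turning a categorical column into numbers). The toy data and column names are made up for illustration, and whether you drop or fill missing values depends on your own dataset.

```python
import pandas as pd

# Hypothetical toy data; the column names are made up for illustration.
df = pd.DataFrame({
    "age": [25, None, 40, 31],
    "city": ["Lagos", "Abuja", None, "Lagos"],
    "income": [120.0, 95.5, 80.0, None],
})

# Option A: drop every row that has at least one missing value.
dropped = df.dropna()

# Option B: fill the gaps instead of dropping rows
# (median for numeric columns, a placeholder label for the text column).
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].median())
filled["income"] = filled["income"].fillna(filled["income"].median())
filled["city"] = filled["city"].fillna("Unknown")

# Convert the categorical column to numeric indicator columns (one-hot encoding).
encoded = pd.get_dummies(filled, columns=["city"])

print(dropped)
print(encoded)
```

Dropping is the simplest option but throws rows away; filling keeps them at the cost of guessing values, so it is worth checking how many rows each approach leaves you with.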
Re: Chronicle Of A Data Scientist/analyst by jiggyniga: 1:33pm On Jul 10, 2020
KlausMichaelson:
Good evening House. I've been following this thread for a while now. I'm a lover of Data analysis.

I'm in my final year, an Engineering student. Sooner than later when this pandemic is over, I'll be analysing my project report. I'll be using some statistical analysis like ANOVA table and others to analyse my Data.

Now I would consider myself intermediate in Python and in Excel. I'm also a beginner in Pandas. I'm yet to learn SQL, MySQL, Matplotlib, seaborn, etc. I have only gotten an intro to Pandas and it's really cool and interesting. I'll soon be done with it.


My questions are:
(1). Which of these (Pandas, SQL, MySQL, Matplotlib, seaborn, among others) should I master so as to have a smooth ride analyzing my project results? Which of them can also produce an ANOVA table?

(2). Which of them aside Pandas (cos I've already started learning Pandas), should I start learning immediately I master Pandas? I'm very conscious of time. I'll really love to use this little time we have to master the more important ones for final year result Analysis. I want to make money from it when school resumes. I want to Master the ones I can so that I can help provide Analysis on other students project results and get paid for it.


Pls I need your Candid responses. Thank you in advance

I think SPSS can effectively handle most data and analysis thrown at it in the academic realm. If you are considering Python, the scipy and researchpy libraries can do the same analysis. Graphs in SPSS kinda suck, though, so you can learn matplotlib, seaborn and probably plotly for data visualization if you need something better.
But overall I think an in-depth "theoretical" knowledge of statistics is the foundation of analysis, so you know what to do and why. The tools are just there to enhance everything.
Re: Chronicle Of A Data Scientist/analyst by KlausMichaelson: 5:21pm On Jul 10, 2020
jiggyniga:


I think SPSS can effectively handle most data and analysis thrown at it in the academic realm. If you are considering Python, the scipy and researchpy libraries can do the same analysis. Graphs in SPSS kinda suck, though, so you can learn matplotlib, seaborn and probably plotly for data visualization if you need something better.
But overall I think an in-depth "theoretical" knowledge of statistics is the foundation of analysis, so you know what to do and why. The tools are just there to enhance everything.


Good day sir and thanks for your response. I really appreciate it.

Sir is SPSS a library in Jupyter notebook or it's a software entirely on its own?

Why Is pandas not enough for the job??
Re: Chronicle Of A Data Scientist/analyst by jiggyniga: 6:47pm On Jul 10, 2020
KlausMichaelson:



Good day sir and thanks for your response. I really appreciate it.

Sir is SPSS a library in Jupyter notebook or it's a software entirely on its own?

Why Is pandas not enough for the job??


Yes, it's separate software on its own. Pandas is mainly meant for dataframe wrangling, although you can plot with it, but if you need to run ANOVA you need the scipy library to do it.
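
To make the split between the two libraries concrete, here is a rough sketch of a one-way ANOVA where pandas only groups the data and scipy's `stats.f_oneway` does the test. The measurements and the column names `treatment` and `strength` are invented for the example.

```python
import pandas as pd
from scipy import stats

# Hypothetical measurements: strength readings for three made-up treatments.
data = pd.DataFrame({
    "treatment": ["A"] * 5 + ["B"] * 5 + ["C"] * 5,
    "strength": [20.1, 21.3, 19.8, 20.7, 21.0,
                 23.4, 22.9, 24.1, 23.0, 23.8,
                 20.5, 19.9, 21.1, 20.2, 20.8],
})

# pandas does the wrangling: split the readings into one array per group.
groups = [g["strength"].to_numpy() for _, g in data.groupby("treatment")]

# scipy does the statistics: one-way ANOVA across the groups.
result = stats.f_oneway(*groups)
print(f"F = {result.statistic:.2f}, p = {result.pvalue:.4f}")
```

A small p-value suggests that at least one group mean differs from the others; it does not say which one, so a follow-up comparison is usually needed.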
Re: Chronicle Of A Data Scientist/analyst by KlausMichaelson: 7:21pm On Jul 10, 2020
jiggyniga:


Yes, it's separate software on its own. Pandas is mainly meant for dataframe wrangling, although you can plot with it, but if you need to run ANOVA you need the scipy library to do it.

Wow! Thanks a lot sir!

1 Like

Re: Chronicle Of A Data Scientist/analyst by hardytech: 7:22pm On Jul 10, 2020
Good evening. This goes to all the experts here: how do you rate Andrei's Zero to Mastery course on machine learning and data science, considering that he has created good courses on web development? I would appreciate a swift response and input.
Re: Chronicle Of A Data Scientist/analyst by Samzeal(m): 10:57pm On Jul 10, 2020
saheedniyi22:

It depends on the tools you use.
I use pandas. My main aims when cleaning are to remove missing values and to convert all categorical features to numerical features, but it depends on what you want to do with the data.

Thank you, I have been able to fix it.
Re: Chronicle Of A Data Scientist/analyst by Shepherdd(m): 12:16am On Jul 11, 2020
saheedniyi22:


Thanks for this,

I have been trying to scrape a certain car-selling website. Unlike Nairaland, which loads all its details at once, this website keeps bringing up information for new cars as you scroll down. When I tried to scrape it, it only gave me the information for 10 cars.
Is there a way to make it give me more cars' information?


For pages injected with JavaScript, Selenium is the main tool here. You will need to combine Selenium with infinite scrolling (it will stop once it reaches end of page). A good approach is to scroll to the end of the page first and once everything is loaded in the DOM you can grab them and send them to BeautifulSoup for parsing.
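
A rough sketch of that approach, in case it helps: keep scrolling until the page height stops growing, then hand the fully loaded HTML to BeautifulSoup. The helper names `scroll_to_bottom` and `scrape_listing`, the pause length and the round limit are my own assumptions, and it presumes Chrome plus the selenium and beautifulsoup4 packages are installed.

```python
import time


def scroll_to_bottom(driver, pause_seconds=2.0, max_rounds=50):
    """Scroll until the page height stops growing; return the number of scrolls."""
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        time.sleep(pause_seconds)  # give the JavaScript time to load more items
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:  # nothing new was loaded: end of page
            break
        last_height = new_height
    return rounds


def scrape_listing(url):
    """Load the page, scroll to the end, then parse the full HTML."""
    # Imported here so the helper above can be used without these packages.
    from bs4 import BeautifulSoup
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        scroll_to_bottom(driver)
        return BeautifulSoup(driver.page_source, "html.parser")
    finally:
        driver.quit()
```

Calling `scrape_listing` with the listing URL returns a BeautifulSoup object you can search with `find_all` as usual. The fixed pause is the crude part: on a slow connection it may need to be longer, otherwise the loop can stop before the next batch has arrived.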

4 Likes 1 Share

Re: Chronicle Of A Data Scientist/analyst by Dum20: 5:49am On Jul 11, 2020
Hello Guys,

As I mentioned earlier, below is another one from Fiverr. The budget is $108. My objective is purely for us to discuss it, and for people to see real-life data science problems that people pay to have solved.

I do not even know how to solve this one. I hope our gurus in the house can use it to teach us. Just give us an idea of how to approach it.


>>>>> hi Everybody, I looking for someone for Machine Learning sequence prediction models who are not afraid from numbers & big numbers, So the work will be analyse a file with more then 32000 results of raffles. Each raffle is include 4 cards, Heart, Diamond, Clubs and Leaf. So in each raffle there are result for each cards, means 4 results. Now I think that on this file there is something that return and based on old results new result is coming. There is not pattern or something, you should find something like that. After we success to predict the next raffle, I need it automate, so every time I will load the new CSV it will predict the next results. Do you think you can handle this big file and try to found some algorithm or solution to get high accuracy of the results in next raffle? I've attached here the CSV file with the results. I also know the next raffle result, so let me know if you can predict the results. What I need from you is a "demo" or predict the next raffle results and we compare the results with the real results. Regards.<<<<<
Re: Chronicle Of A Data Scientist/analyst by Shepherdd(m): 2:27pm On Jul 11, 2020
.
Re: Chronicle Of A Data Scientist/analyst by kunleiky(m): 2:31pm On Jul 11, 2020
yemyke001:


It means when you divide data['rowID'] by 10 the remainder will be 0
Thanks to you and @Dum20 that posted this question. I actually tried this today and it worked. I applied it to a Fortune 500 companies dataset. Let's keep the spirit flying here. Kudos to @Ejiod and everyone in here.
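
For anyone reading along, the remainder check described above can be sketched like this. Only the `rowID` column name comes from the discussion; the frame itself is invented.

```python
import pandas as pd

# Hypothetical frame; only the 'rowID' column name comes from the discussion.
data = pd.DataFrame({"rowID": range(1, 31), "value": range(101, 131)})

# rowID % 10 is the remainder after dividing by 10;
# it equals 0 only for multiples of 10, so this keeps every tenth row.
every_tenth = data[data["rowID"] % 10 == 0]
print(every_tenth["rowID"].tolist())  # [10, 20, 30]
```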
Re: Chronicle Of A Data Scientist/analyst by kunleiky(m): 2:50pm On Jul 11, 2020
iCode2:
What am I doing wrong?

Just put r immediately after the opening bracket such that it reads:
data = pd.read_csv(r"C:\users blah blah blah...")
That should solve it.
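
The reason the r prefix matters is that Python treats a backslash in a normal string as the start of an escape sequence. A small illustration with a made-up path (not the one from the question):

```python
# Hypothetical path, chosen because it contains \n and \t.
plain = "C:\new\table.csv"   # \n becomes a newline, \t becomes a tab
raw = r"C:\new\table.csv"    # the r prefix keeps every backslash literal

print(repr(plain))
print(repr(raw))
print(len(plain), len(raw))
```

With a path that starts `C:\users` or `C:\Users` it is worse: without the r prefix Python reads `\u` or `\U` as the start of a Unicode escape and refuses to run the script at all. Forward slashes, as in `"C:/new/table.csv"`, are another way around it.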
Re: Chronicle Of A Data Scientist/analyst by saheedniyi22(m): 3:58pm On Jul 11, 2020
Shepherdd:


For pages injected with JavaScript, Selenium is the main tool here. You will need to combine Selenium with infinite scrolling (it will stop once it reaches end of page). A good approach is to scroll to the end of the page first and once everything is loaded in the DOM you can grab them and send them to BeautifulSoup for parsing.

What do you mean by Dom??
Re: Chronicle Of A Data Scientist/analyst by hardytech: 11:43pm On Jul 11, 2020
hardytech:
Good evening. This goes to all the experts here: how do you rate Andrei's Zero to Mastery course on machine learning and data science, considering that he has created good courses on web development? I would appreciate a swift response and input.
still waiting for a reply from experienced guys here.
Re: Chronicle Of A Data Scientist/analyst by Zabiboy: 10:05am On Jul 12, 2020
hardytech:

still waiting for a reply from experienced guys here.

There are so many videos on ML, man.
If you feel the guy is good, trust your gut and go ahead.
Most of us used different videos to learn.
That's why we are here: to share ideas on areas that our tutors ignored.
GL

1 Like

Re: Chronicle Of A Data Scientist/analyst by Shepherdd(m): 2:33pm On Jul 12, 2020
saheedniyi22:


What do you mean by Dom??

Document Object Model. That's where the browser keeps the HTML tree.
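
To get a feel for what "HTML tree" means, here is a small sketch using Python's built-in `xml.dom.minidom`, which builds the same kind of node tree. The markup is invented, and minidom is a strict XML parser, so this is only an illustration: a real page would go through the browser's own DOM and then an HTML parser such as BeautifulSoup.

```python
from xml.dom import minidom

# Hypothetical markup standing in for a page after it has finished loading.
page = """
<html>
  <body>
    <div class="car"><h2>Toyota Corolla</h2><span>2015</span></div>
    <div class="car"><h2>Honda Accord</h2><span>2012</span></div>
  </body>
</html>
""".strip()

dom = minidom.parseString(page)

# Walk the tree: every <div> node has child nodes for the title and the year.
cars = []
for div in dom.getElementsByTagName("div"):
    title = div.getElementsByTagName("h2")[0].firstChild.data
    year = div.getElementsByTagName("span")[0].firstChild.data
    cars.append((title, year))

print(cars)  # [('Toyota Corolla', '2015'), ('Honda Accord', '2012')]
```

When JavaScript loads more cars as you scroll, it is adding more `div` nodes to this tree, which is why the scraper only sees them after scrolling.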
Re: Chronicle Of A Data Scientist/analyst by Shepherdd(m): 2:33pm On Jul 12, 2020
Dum20:
Hello Guys,

As I mentioned earlier, below is another one from Fiverr. The budget is $108. My objective is purely for us to discuss it, and for people to see real-life data science problems that people pay to have solved.

I do not even know how to solve this one. I hope our gurus in the house can use it to teach us. Just give us an idea of how to approach it.


>>>>> hi Everybody, I looking for someone for Machine Learning sequence prediction models who are not afraid from numbers & big numbers, So the work will be analyse a file with more then 32000 results of raffles. Each raffle is include 4 cards, Heart, Diamond, Clubs and Leaf. So in each raffle there are result for each cards, means 4 results. Now I think that on this file there is something that return and based on old results new result is coming. There is not pattern or something, you should find something like that. After we success to predict the next raffle, I need it automate, so every time I will load the new CSV it will predict the next results. Do you think you can handle this big file and try to found some algorithm or solution to get high accuracy of the results in next raffle? I've attached here the CSV file with the results. I also know the next raffle result, so let me know if you can predict the results. What I need from you is a "demo" or predict the next raffle results and we compare the results with the real results. Regards.<<<<<


The problem is about sequence prediction. You take an n-gram of raffles and you predict the next one. Since each raffle has four results, one per card, IMHO you can have four models, one for each card, i.e. one model handles all the Diamond history, and so on.

Once you are done training the four models, you can create a function that abstracts them: it takes the four cards, distributes them to the models and outputs the predictions.
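
A very rough sketch of that structure, using a simple counting n-gram model as a stand-in for whatever model you actually train. The class and function names, the dict-per-raffle layout and the toy history are all my own assumptions; the client's CSV may look quite different.

```python
from collections import Counter, defaultdict

SUITS = ("heart", "diamond", "clubs", "leaf")


class NGramModel:
    """Predicts the next value from the previous n values by counting."""

    def __init__(self, n=2):
        self.n = n
        self.counts = defaultdict(Counter)  # context -> counts of what came next
        self.overall = Counter()            # fallback: overall value frequencies

    def fit(self, history):
        for i in range(len(history) - self.n):
            context = tuple(history[i:i + self.n])
            self.counts[context][history[i + self.n]] += 1
        self.overall.update(history)
        return self

    def predict(self, recent):
        context = tuple(recent[-self.n:])
        seen = self.counts.get(context)
        # Fall back to the most frequent value overall for unseen contexts.
        source = seen if seen else self.overall
        return source.most_common(1)[0][0]


def train_models(raffles, n=2):
    """raffles: list of dicts, one per raffle, keyed by suit (assumed layout)."""
    return {
        suit: NGramModel(n).fit([raffle[suit] for raffle in raffles])
        for suit in SUITS
    }


def predict_next(models, recent_raffles):
    """Send each suit's recent results to its own model and collect the outputs."""
    return {
        suit: model.predict([raffle[suit] for raffle in recent_raffles])
        for suit, model in models.items()
    }


# Hypothetical toy history with an obvious repeating pattern.
cards = ["7", "8", "9"]
raffles = [
    {"heart": cards[i % 3], "diamond": cards[(i + 1) % 3],
     "clubs": "K", "leaf": cards[i % 2]}
    for i in range(30)
]
models = train_models(raffles)
prediction = predict_next(models, raffles[-2:])
print(prediction)
```

The toy data repeats on purpose so the counts have something to find. On the real file, I would hold back the last few hundred raffles and compare the accuracy against simply guessing the most common card, since a truly random draw leaves nothing for any model to learn.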

3 Likes

Re: Chronicle Of A Data Scientist/analyst by ibromodzi: 8:12pm On Jul 12, 2020
KlausMichaelson:
Good evening House. I've been following this thread for a while now. I'm a lover of Data analysis.

I'm in my final year, an Engineering student. Sooner than later when this pandemic is over, I'll be analysing my project report. I'll be using some statistical analysis like ANOVA table and others to analyse my Data.

Now I would consider myself intermediate in Python and in Excel. I'm also a beginner in Pandas. I'm yet to learn SQL, MySQL, Matplotlib, seaborn, etc. I have only gotten an intro to Pandas and it's really cool and interesting. I'll soon be done with it.


My questions are:
(1). Which of these (Pandas, SQL, MySQL, Matplotlib, seaborn, among others) should I master so as to have a smooth ride analyzing my project results? Which of them can also produce an ANOVA table?

(2). Which of them aside Pandas (cos I've already started learning Pandas), should I start learning immediately I master Pandas? I'm very conscious of time. I'll really love to use this little time we have to master the more important ones for final year result Analysis. I want to make money from it when school resumes. I want to Master the ones I can so that I can help provide Analysis on other students project results and get paid for it.


Pls I need your Candid responses. Thank you in advance

If you are comfortable using Python, you can achieve most of these tasks with it (just learn the relevant libraries)... However, the deeper you go into data analysis/science, the more you discover different tools and what each is best at. My advice is that whatever tool you are learning, learn it well and practice with it.

For data manipulation, pandas is a must. For visualization, you need matplotlib and/or seaborn. I highly recommend plotly for fancy charts. I don't think you'll be needing SQL for now since you are not going to be interacting with a DB.
Re: Chronicle Of A Data Scientist/analyst by KlausMichaelson: 8:35pm On Jul 12, 2020
ibromodzi:


If you are comfortable using Python, you can achieve most of these tasks with it (just learn the relevant libraries)... However, the deeper you go into data analysis/science, the more you discover different tools and what each is best at. My advice is that whatever tool you are learning, learn it well and practice with it.

For data manipulation, pandas is a must. For visualization, you need matplotlib and/or seaborn. I highly recommend plotly for fancy charts. I don't think you'll be needing SQL for now since you are not going to be interacting with a DB.


Wow. I really appreciate your Candid response. I'm already getting to the end of Pandas learning. Tho I'll keep practicing because I can see it's a must.

Now like you said, my next target will be Matplotlib.

Please which one is plotly again ;(
And what is DB( Mongo DB??)
Re: Chronicle Of A Data Scientist/analyst by Zabiboy: 8:44pm On Jul 12, 2020
KlausMichaelson:



Wow. I really appreciate your Candid response. I'm already getting to the end of Pandas learning. Tho I'll keep practicing because I can see it's a must.

Now like you said, my next target will be Matplotlib.

Please which one is plotly again ;(
And what is DB( Mongo DB??)

grin ...
Plotly is similar to matplotlib for plotting charts ...
Just take it one at a time..
DB means DataBase
Re: Chronicle Of A Data Scientist/analyst by KlausMichaelson: 8:50pm On Jul 12, 2020
Zabiboy:
grin ... Plotly is similar to matplotlib for plotting charts ... Just take it one at a time.. DB means DataBase
Oh, now I get it. Thank you sir, I appreciate it.
Re: Chronicle Of A Data Scientist/analyst by hardytech: 12:20am On Jul 13, 2020
Zabiboy:
There are so many videos on ML, man. If you feel the guy is good, trust your gut and go ahead. Most of us used different videos to learn. That's why we are here: to share ideas on areas that our tutors ignored. GL
thanks boss
Re: Chronicle Of A Data Scientist/analyst by peterincredible: 12:45am On Jul 13, 2020
hardytech:

thanks boss
I am currently taking the course. I am on the last chapter, on deep learning using Google Colab.
I will rate the course 4.5/5. It is a very good course; the machine learning path was mainly taught by Daniel Bourke. I will advise you to take the course.
Re: Chronicle Of A Data Scientist/analyst by blife2: 12:50pm On Jul 13, 2020
Please, I want to start data science. I'm 17 years old and have a laptop with 4 GB of RAM. Where should I start?
I started about 20 days ago but need to re-evaluate. Also, what resources should I use for ML?
Re: Chronicle Of A Data Scientist/analyst by Zabiboy: 12:53pm On Jul 13, 2020
blife2:
Please, I want to start data science. I'm 17 years old and have a laptop with 4 GB of RAM. Where should I start?
I started about 20 days ago but need to re-evaluate. Also, what resources should I use for ML?

angry
What did you start 20 days ago??
Re-phrase your question
Re: Chronicle Of A Data Scientist/analyst by iCode2: 2:18pm On Jul 13, 2020
kunleiky:


Just put r immediately after the opening bracket such that it reads:
data = pd.read_csv(r"C:\users blah blah blah...")
That should solve it.
It was an excel file. That's where the mistake was coming from. Thanks a lot.
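
Since this mix-up comes up a lot, here is a small sketch of picking the pandas reader from the file extension. The helper name `load_table` is hypothetical, and reading `.xlsx` files assumes an Excel engine such as openpyxl is installed.

```python
from pathlib import Path

import pandas as pd


def load_table(path):
    """Pick the pandas reader from the file extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix in (".xlsx", ".xls"):
        # Excel workbooks are not plain text, so read_csv cannot parse them.
        return pd.read_excel(path)
    if suffix == ".csv":
        return pd.read_csv(path)
    raise ValueError(f"Unsupported file type: {suffix!r}")
```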

Cc: Olamyyde
yemyke001
Oddy16
kunleiky
Zabiboy
Re: Chronicle Of A Data Scientist/analyst by hardytech: 2:33pm On Jul 13, 2020
peterincredible:
I am currently taking the course. I am on the last chapter, on deep learning using Google Colab.
I will rate the course 4.5/5. It is a very good course; the machine learning path was mainly taught by Daniel Bourke. I will advise you to take the course.
wow finally, really appreciate your response, what do you plan on doing after the course?
Re: Chronicle Of A Data Scientist/analyst by whizqueen(f): 3:40pm On Jul 13, 2020
.

5 Likes

Re: Chronicle Of A Data Scientist/analyst by saheedniyi22(m): 4:07pm On Jul 13, 2020
*Day 6 of Day 70 prebootcamp class*

Hello guys,

Get to participate and learn the fundamentals of machine learning/ deep learning by registering so that you can get an invite link to classroom here. https:///prebootcamp2020 . Ensure you select your preferable stream of learning.

All intuitive discussion and concept breakdown takes place via our slack channel *#machine-learning-stream-2020* *#deep-learning-stream-2020.* Get to engage in a discussion with other ML enthusiasts by registering to get your membership ID in few minutes. https:///aimembership

Ensure you tweet your experience at us by tagging us @DataScienceNIG, and using the following hashtags #70daysofML #70daysofDL

Cheers.

1 Like

Re: Chronicle Of A Data Scientist/analyst by saheedniyi22(m): 4:11pm On Jul 13, 2020
saheedniyi22:
*Day 6 of Day 70 prebootcamp class*

Hello guys,

Get to participate and learn the fundamentals of machine learning/ deep learning by registering so that you can get an invite link to classroom here. https:///prebootcamp2020 . Ensure you select your preferable stream of learning.

All intuitive discussion and concept breakdown takes place via our slack channel *#machine-learning-stream-2020* *#deep-learning-stream-2020.* Get to engage in a discussion with other ML enthusiasts by registering to get your membership ID in few minutes. https:///aimembership

Ensure you tweet your experience at us by tagging us @DataScienceNIG, and using the following hashtags #70daysofML #70daysofDL

Cheers.

This is day 13 though. I'm sharing it late, but you can still catch up.

1 Like

Re: Chronicle Of A Data Scientist/analyst by blife2: 12:14am On Jul 14, 2020
Zabiboy:


angry
What did you start 20 days ago??
Re-phrase your question
It's been 20 days since I started looking at data science in general.
Re: Chronicle Of A Data Scientist/analyst by Joshgrey: 7:44am On Jul 14, 2020
mcemmy0z:
If you reside around songo Ota I have these available
*Udemy - Beginner to Pro in Excel Financial Modeling and Valuation
*Udemy - SQL - MySQL for Data Analytics and Business Intelligence
*Tableau 10 A-Z Hands-On Tableau Training For Data Science!
*Tableau Hands-on Learn Data Visualization with Tableau
*Udemy - Power BI A-Z Hands-On Power BI Training For Data Science
*Udemy - Machine Learning A-Z™ Hands-On Python & R In Data Science
*Udemy - Python for Financial Analysis and Algorithmic Trading



How can I get this please. Am interested
