
Chronicle Of A Data Scientist/analyst - Programming (61) - Nairaland


Re: Chronicle Of A Data Scientist/analyst by Samzeal(m): 3:39am On Jul 28, 2020
I am having a problem installing MySQL: the installer says MySQL Workbench requires the Visual C++ 2019 Redistributable package to be installed. Please, how can I go about this?
Re: Chronicle Of A Data Scientist/analyst by wisemania(m): 3:57am On Jul 28, 2020
HewlettPackard:


Sorry about your health, quick recovery.

What materials are you using to study SQL? I am good at writing basic queries, but I need to advance my skills. Can you help in this regard?


Please, this goes out to other posters too.

https://drive.google.com/uc?id=1riG_93vaoPUG81V7f9dq2E95LAnhcTkc&export=download

https://www.freetutorialsus.com/sql-mysql-for-data-analytics-and-business-intelligence-udemy-course-free-download/


The first teaches design while the second teaches you manipulation


Take your time.

3 Likes 1 Share

Re: Chronicle Of A Data Scientist/analyst by wisemania(m): 3:58am On Jul 28, 2020
Samzeal:
I am having a problem installing MySQL: the installer says MySQL Workbench requires the Visual C++ 2019 Redistributable package to be installed. Please, how can I go about this?

https://www.freetutorialsus.com/sql-mysql-for-data-analytics-and-business-intelligence-udemy-course-free-download/

This tutorial shows you how to go about it.
Re: Chronicle Of A Data Scientist/analyst by Dahyormine(m): 10:46am On Jul 28, 2020
Samzeal:

am interested
PM
Re: Chronicle Of A Data Scientist/analyst by Henry651(m): 10:49am On Jul 28, 2020
Dahyormine:
PM
This is odd, I know, but I am new here and I don't know how to upload a story here.
Re: Chronicle Of A Data Scientist/analyst by Dum20: 1:50pm On Jul 28, 2020
[quote author=KlausMichaelson post=92136556]

That's very good. Keep it up, sir. Yes, you can check out *CS Dojo* and *Keith Galli* on YouTube. Look for their videos on intro to data science and visualization, pandas, pyplot, Matplotlib, etc. They will open your eyes.

Modified
I use a notebook to take down every step for any tool I'm learning from the videos. I literally have a jotter where I write down every step. I do this because I go through my jottings whenever I'm not on my system. So it requires patience.[/quote]


Bro thank you for this. I am really gaining some insights and understanding the use of Python in Data Science. I am going through the Keith Galli video on "Solving Real world data science tasks with Python". It looks cool.

But I observed that what is being done using pandas and Matplotlib can be done comfortably with Excel (Power Query).

To me, Excel is less cumbersome than typing lines of code in Python.

So I want to ask the ogas in the house why they say they prefer Python.

Looking forward to the discussion

1 Like

Re: Chronicle Of A Data Scientist/analyst by KlausMichaelson: 3:40pm On Jul 28, 2020
Good day, Sir. To answer your question below:

**But I observed that what is being done using pandas and Matplotlib can be done comfortably with Excel (Power Query)**

As an intermediate user, I'd like to explain why I find pandas much more flexible than Excel.
Here are my reasons:

(1). Using pandas, you can easily create a new column that performs multiple computations (addition, subtraction, formulae, etc.) on values from other columns in just a single line of code.
But Excel is limited here, as you can only do it for each cell in a new column one by one, and this can be very cumbersome.

(2). Excel doesn't give you a visualization to your preferred taste. That doesn't mean you can't design a plot (its axes, gradient line, ticks and the like). But pandas with Matplotlib and pyplot is more flexible: you can easily change the marker colour, line size, line colour and the like with a single line of code (even shorthand notation).
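Point (1) can be sketched roughly as below. This is only an illustration: the column names (`price`, `quantity`, `discount`, `revenue`) and the figures are made up, not taken from anyone's data.

```python
import pandas as pd

# Hypothetical sales data; names and numbers are invented for illustration.
df = pd.DataFrame({
    "price": [100.0, 250.0, 80.0],
    "quantity": [3, 1, 5],
    "discount": [0.10, 0.00, 0.25],
})

# A single line fills the new column for every row at once,
# combining multiplication and subtraction across three other columns.
df["revenue"] = df["price"] * df["quantity"] * (1 - df["discount"])

print(df)
```

The arithmetic is applied to whole columns, so the same line works whether the table has three rows or three million.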

There are other reasons why I think pandas/Matplotlib/pyplot is more flexible than Excel, but those are the ones I can remember at the moment. I stand to be corrected, though, because I'm still a learner.
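On point (2), the kind of one-line styling meant here looks roughly like this. The data points are made up, and the `Agg` backend and output file name are assumptions chosen so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Made-up data points, purely for illustration.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

fig, ax = plt.subplots()

# Shorthand format string: 'r' = red, 'o' = circle markers, '--' = dashed line.
# Line width and marker size are set in the same call.
(line,) = ax.plot(x, y, "ro--", linewidth=2, markersize=8)

ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Styled line plot")
fig.savefig("styled_plot.png")  # hypothetical output file name
```

Changing the look is a matter of editing the format string or the keyword arguments, for example `"bs-"` for blue squares on a solid line.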

Our Pros in the house can help too
Re: Chronicle Of A Data Scientist/analyst by randomShek: 6:52pm On Jul 28, 2020
Hello guys

I just got the MySQL tutorial by Mosh and also downloaded the one by Mike Dane (Giraffe Academy). Which video/course would you recommend for advanced SQL?

Or are those two videos enough?
Re: Chronicle Of A Data Scientist/analyst by BelieverDE: 7:53pm On Jul 28, 2020
KlausMichaelson:





Good day, Sir. To answer your question below:

**But I observed that what is being done using pandas and Matplotlib can be done comfortably with Excel (Power Query)**

As an intermediate user, I'd like to explain why I find pandas much more flexible than Excel.
Here are my reasons:

(1). Using pandas, you can easily create a new column that performs multiple computations (addition, subtraction, formulae, etc.) on values from other columns in just a single line of code.
But Excel is limited here, as you can only do it for each cell in a new column one by one, and this can be very cumbersome.

(2). Excel doesn't give you a visualization to your preferred taste. That doesn't mean you can't design a plot (its axes, gradient line, ticks and the like). But pandas with Matplotlib and pyplot is more flexible: you can easily change the marker colour, line size, line colour and the like with a single line of code (even shorthand notation).

There are other reasons why I think pandas/Matplotlib/pyplot is more flexible than Excel, but those are the ones I can remember at the moment. I stand to be corrected, though, because I'm still a learner.

Our Pros in the house can help too
I have been learning Excel for close to 4 months. Everything you described above can be done with Excel.

If you load your data to the Data Model or convert the table to an Excel Table, you can perform calculations to a column.

You can also design the graph to your preferences described above.

4 Likes

Re: Chronicle Of A Data Scientist/analyst by Dahyormine(m): 8:03pm On Jul 28, 2020
Henry651:
This is odd, I know, but I am new here and I don't know how to upload a story here.
Reply to your email; I sent you a PM.
Re: Chronicle Of A Data Scientist/analyst by ibromodzi: 8:47pm On Jul 28, 2020
BelieverDE:

I have been learning Excel for close to 4 months. Everything you described above can be done with Excel.

If you load your data to the Data Model or convert the table to an Excel Table, you can perform calculations to a column.

You can also design the graph to your preferences described above.

Flexibility.....
Re: Chronicle Of A Data Scientist/analyst by ibromodzi: 8:57pm On Jul 28, 2020
KlausMichaelson:
I'm so happy right now.

I mean, it's already a month and 6 days into this journey. Before now, around 2019 or so, before school resumed, I watched some videos on Python but didn't take it too seriously. And yes, I have had a good foundation in Excel since my 300 level, because I use it to plot assignments on graphs (fairly complex ones) and I make a lot of money from it. I also understand its syntax and operations, but I would still love to know more.

On June 20th 2020 (last month) I saw this thread and became very serious about learning data analysis. So serious that I had to deactivate my WhatsApp because it was always taking my precious time away.

Now here I am today, and here are the things I have learnt after staying away from social media for a month and 6 days:
(1). Python (intermediate, though)
(2). Pandas (intermediate). I can do almost anything with it, but I still won't call myself a pro. Pandas is much better than Excel as it can run operations on many cells at once.
(3). Matplotlib (same as pandas, my wonderful friend)
(4). NumPy (still a learner because I haven't seen much of its use). And lastly
(5). Pyplot (my very good friend also)
Currently I'm learning Seaborn and Power BI.


Honestly speaking, I know there is much more to learn. But seeing that I'm still a student, I feel I need to focus on the ones that are most important to me as a student. So my targets are:

(1). Master all the data analysis tools I've learnt so far. Later on, after my studies, I'll learn SQL, Tableau and others.

(2). Find out which tool is best for the ANOVA design table.

(3). Do data analysis for my fellow final-year students' undergraduate reports and get paid.
(4). Conduct seminars for students on data analysis and get paid.

Please do well to proffer any other library or tool to help me meet my targets, especially for the ANOVA design.

Thank you @Ejiod for the wonderful thread, and to everyone on this platform.

Any reason why ANOVA is specifically mentioned? Do you even know when to use it in a statistical analysis?
Are you not supposed to focus on all the concepts of descriptive and inferential analytics if your aim is academic?

See, data analysis is more than just mastering the tools; the theoretical knowledge and application of statistics are equally important.

1 Like

Re: Chronicle Of A Data Scientist/analyst by KlausMichaelson: 9:25pm On Jul 28, 2020
ibromodzi:


Any reason why ANOVA is specifically mentioned? Do you even know when to use it in a statistical analysis?
Are you not supposed to focus on all the concepts of descriptive and inferential analytics if your aim is academic?

See, data analysis is more than just mastering the tools; the theoretical knowledge and application of statistics are equally important.

Honestly, the reason I'm so concerned about ANOVA is simply that our lecturers are always talking about it. I can do the calculations using the various formulae to get the required values for the ANOVA table. That's fine, but what if I have a very large dataset?
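For a large dataset, the same hand calculation can be handed over to pandas. Below is a minimal sketch of a one-way ANOVA table, assuming the data is in long format (one row per observation). The column names `treatment` and `yield_kg` and all the numbers are hypothetical.

```python
import pandas as pd

# Hypothetical experiment: three treatments, four replicates each.
data = pd.DataFrame({
    "treatment": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
    "yield_kg": [20, 22, 19, 21, 25, 27, 26, 24, 30, 29, 31, 32],
})

groups = data.groupby("treatment")["yield_kg"]
grand_mean = data["yield_kg"].mean()
k = data["treatment"].nunique()  # number of groups
n = len(data)                    # total number of observations

# Between-group sum of squares: group size times the squared distance
# of each group mean from the grand mean.
ss_between = (groups.count() * (groups.mean() - grand_mean) ** 2).sum()

# Within-group (error) sum of squares: squared distance of each
# observation from its own group mean.
ss_within = ((data["yield_kg"] - groups.transform("mean")) ** 2).sum()

df_between, df_within = k - 1, n - k
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

anova_table = pd.DataFrame(
    {
        "SS": [ss_between, ss_within],
        "df": [df_between, df_within],
        "MS": [ms_between, ms_within],
        "F": [f_stat, float("nan")],
    },
    index=["Between treatments", "Within treatments (error)"],
)
print(anova_table)
```

The code does not change with the number of rows, so a very large dataset costs no extra typing.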

The truth of the matter is that they won't give me a straightforward answer about the real-life application of the ANOVA design, nor will they tell me specifically when to use it whenever I ask. They just beat around the bush.

ANOVA stands for Analysis of Variance and, to my understanding, it is used to analyze data that comes with a number of replications for a given sample. It helps determine an experiment's standard error and some other parameters which I can't remember. My school books are far away from me at the moment.


Anyway, you're right that one needs to be grounded not only in the practical but also in the theoretical part of data analysis. The reason I felt an ANOVA table could be gotten from one of these libraries or tools is simply a pandas method I came across, **df.describe()**. It brings up a table showing some summary statistics for any data you give it.
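For reference, here is a small sketch of what `df.describe()` returns. The column names and values are made up. It reports count, mean, standard deviation, minimum, quartiles and maximum for each numeric column, which is a descriptive summary and not an ANOVA table.

```python
import pandas as pd

# Made-up data, for illustration only.
df = pd.DataFrame({
    "score": [55, 60, 65, 70, 75],
    "hours": [1, 2, 3, 4, 5],
})

# One row per statistic, one column per numeric column in df.
summary = df.describe()
print(summary)
```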

Maybe none of the tools or libraries offers an ANOVA design, but that's no problem. I'll work towards my other targets.

1 Like 1 Share

Re: Chronicle Of A Data Scientist/analyst by KlausMichaelson: 9:30pm On Jul 28, 2020
BelieverDE:

I have been learning Excel for close to 4 months. Everything you described above can be done with Excel.

If you load your data to the Data Model or convert the table to an Excel Table, you can perform calculations to a column.

You can also design the graph to your preferences described above.

Wow, I haven't gotten to that level then. Thanks for the insight.
Do you mean you can do a computation in a new column that uses values from other columns with just a single input? And it will fill in the answer for every cell in the new column, with just one input or formula?
Re: Chronicle Of A Data Scientist/analyst by ibromodzi: 9:49pm On Jul 28, 2020
KlausMichaelson:


Honestly, the reason I'm so concerned about ANOVA is simply that our lecturers are always talking about it. I can do the calculations using the various formulae to get the required values for the ANOVA table. That's fine, but what if I have a very large dataset?

The truth of the matter is that they won't give me a straightforward answer about the real-life application of the ANOVA design, nor will they tell me specifically when to use it whenever I ask. They just beat around the bush.

ANOVA stands for Analysis of Variance and, to my understanding, it is used to analyze data that comes with a number of replications for a given sample. It helps determine an experiment's standard error and some other parameters which I can't remember. My school books are far away from me at the moment.


Anyway, you're right that one needs to be grounded not only in the practical but also in the theoretical part of data analysis. The reason I felt an ANOVA table could be gotten from one of these libraries or tools is simply a pandas method I came across, **df.describe()**. It brings up a table showing some summary statistics for any data you give it.

Maybe none of the tools or libraries offers an ANOVA design, but that's no problem. I'll work towards my other targets.

You don't need any lecturer to tell you the type of statistics to use; your data should point you towards that. You really need to learn more about inferential statistics. There are many statistical tests available, and a lot of factors need to be considered before choosing the right one. As for pandas, I wonder if you know what being an intermediate implies. df.describe() does not give you ANOVA. You still have a lot of homework to do.
Re: Chronicle Of A Data Scientist/analyst by KlausMichaelson: 9:55pm On Jul 28, 2020
ibromodzi:


You don't need any lecturer to tell you the type of statistics to use; your data should point you towards that. You really need to learn more about inferential statistics. There are many statistical tests available, and a lot of factors need to be considered before choosing the right one. As for pandas, I wonder if you know what being an intermediate implies. df.describe() does not give you ANOVA. You still have a lot of homework to do.


Lol, I never said df.describe() gives ANOVA, Sir. If it did, why would I still be bothered about ANOVA this or ANOVA that?

What I said was that df.describe() brings up a sort of table showing some summary statistics (mean, std, etc.) of any data you give it, and not ANOVA, Sir.

Anyways thanks for your advice.

1 Like

Re: Chronicle Of A Data Scientist/analyst by ibromodzi: 10:06pm On Jul 28, 2020
KlausMichaelson:



Lol, I never said df.describe() gives ANOVA, Sir. If it did, why would I still be bothered about ANOVA this or ANOVA that?

What I said was that df.describe() brings up a sort of table showing some summary statistics (mean, std, etc.) of any data you give it, and not ANOVA, Sir.

Anyways thanks for your advice.

As a matter of fact, df.describe() just gives you an idea of the data you are dealing with, like a statistical summary. Although there is more than one way to carry out in-depth inferential analysis, the scipy library in conjunction with researchpy and pandas should give you what you want.
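As a rough sketch of the scipy route for one-way ANOVA, using `scipy.stats.f_oneway` with pandas to split the groups: the data and the column names `treatment` and `yield_kg` are hypothetical, and researchpy is left out of this example.

```python
import pandas as pd
from scipy import stats

# Hypothetical long-format data: one row per observation.
data = pd.DataFrame({
    "treatment": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
    "yield_kg": [20, 22, 19, 21, 25, 27, 26, 24, 30, 29, 31, 32],
})

# One array of measurements per treatment group.
samples = [g["yield_kg"].to_numpy() for _, g in data.groupby("treatment")]

# One-way ANOVA: tests whether the group means are all equal.
f_stat, p_value = stats.f_oneway(*samples)
print(f"F = {f_stat:.2f}, p = {p_value:.4g}")
```

A small p-value only says that at least one group mean differs from the others; it does not say which one.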

Modified:
Steps in determining which test to use in a statistical analysis:
1. What type of variables are you dealing with? Numerical or categorical?

2. What type of analysis are you doing?
a. Comparison (mean, median, proportion): this is where you use t-tests, ANOVA, etc., depending on the distribution of the data (parametric or not).
b. Relationship between two variables (say gender and smartness): this is where you use correlation (Pearson or Spearman rank).
c. Predicting one variable from another (say exposure to smoke predicting the risk of lung cancer): this is a regression task.

3. Number of groups involved (say the effect of chloroquine in COVID-19 and non-COVID-19 patients: 2 groups).

4. Distribution of your data (normal or not); there are different ways to determine this.

I believe you now understand the point I was making when I mentioned theoretical knowledge of statistics.
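The comparison branch of these steps (2a, 3 and 4) can be sketched in code. This is a simplified illustration, not a complete decision procedure: the helper `compare_groups` is hypothetical, it assumes independent groups of numerical data, and it uses the Shapiro-Wilk test as just one of the possible normality checks.

```python
from scipy import stats


def compare_groups(*groups, alpha=0.05):
    """Pick a comparison test from the number of groups and a normality check.

    Hypothetical helper for illustration; assumes independent numerical samples.
    Returns the name of the chosen test and its p-value.
    """
    if len(groups) < 2:
        raise ValueError("need at least two groups to compare")

    # Step 4: Shapiro-Wilk on each group. A p-value above alpha means
    # normality is not rejected for that group.
    looks_normal = all(stats.shapiro(g)[1] > alpha for g in groups)

    # Step 3: the number of groups decides between two-sample and k-sample tests.
    if len(groups) == 2:
        if looks_normal:
            name = "independent t-test"
            _, p = stats.ttest_ind(*groups)
        else:
            name = "Mann-Whitney U"
            _, p = stats.mannwhitneyu(*groups, alternative="two-sided")
    else:
        if looks_normal:
            name = "one-way ANOVA"
            _, p = stats.f_oneway(*groups)
        else:
            name = "Kruskal-Wallis"
            _, p = stats.kruskal(*groups)
    return name, p


# Made-up measurements for three groups.
a = [20, 22, 19, 21]
b = [25, 27, 26, 24]
c = [30, 29, 31, 32]
print(compare_groups(a, b))
print(compare_groups(a, b, c))
```

Relationships (2b) and prediction (2c) would call for different functions, for example `scipy.stats.pearsonr` or `scipy.stats.spearmanr` for correlation.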

5 Likes

Re: Chronicle Of A Data Scientist/analyst by KlausMichaelson: 10:18pm On Jul 28, 2020
ibromodzi:


As a matter of fact, df.describe() just gives you an idea of the data you are dealing with, like a statistical summary. Although there is more than one way to carry out in-depth inferential analysis, the scipy library in conjunction with researchpy and pandas should give you what you want.

Thank you very much Sir. I really appreciate.

1 Like

Re: Chronicle Of A Data Scientist/analyst by HewlettPackard: 12:35am On Jul 29, 2020
wisemania:


https://drive.google.com/uc?id=1riG_93vaoPUG81V7f9dq2E95LAnhcTkc&export=download

https://www.freetutorialsus.com/sql-mysql-for-data-analytics-and-business-intelligence-udemy-course-free-download/


The first teaches design while the second teaches you manipulation


Take your time.

Thanks.

The first link is requesting a password for the files. Any help?
Re: Chronicle Of A Data Scientist/analyst by wisemania(m): 2:31am On Jul 29, 2020
HewlettPackard:


Thanks.

The first link is requesting a password for the files. Any help?

Getfreetutorial.com

2 Likes

Re: Chronicle Of A Data Scientist/analyst by Dum20: 6:23am On Jul 29, 2020
BelieverDE:

I have been learning Excel for close to 4 months. Everything you described above can be done with Excel.

If you load your data to the Data Model or convert the table to an Excel Table, you can perform calculations to a column.

You can also design the graph to your preferences described above.


Excel is very powerful and graphical at the same time; you see results as you go.

I know @Ejiod said he outshone those using Excel. It would be great to get his thoughts, and those of the other ogas, on this.
Re: Chronicle Of A Data Scientist/analyst by Dum20: 6:23am On Jul 29, 2020
ibromodzi:


As a matter of fact, df.describe() just gives you an idea of the data you are dealing with, like a statistical summary. Although there is more than one way to carry out in-depth inferential analysis, the scipy library in conjunction with researchpy and pandas should give you what you want.

Modified:

Steps in determining which test to use in a statistical analysis:
1. What type of variables are you dealing with? Numerical or categorical?

2. What type of analysis are you doing?
a. Comparison (mean, median, proportion): this is where you use t-tests, ANOVA, etc., depending on the distribution of the data (parametric or not).
b. Relationship between two variables (say gender and smartness): this is where you use correlation (Pearson or Spearman rank).
c. Predicting one variable from another (say exposure to smoke predicting the risk of lung cancer): this is a regression task.

3. Number of groups involved (say the effect of chloroquine in COVID-19 and non-COVID-19 patients: 2 groups).

4. Distribution of your data (normal or not); there are different ways to determine this.

I believe you now understand the point I was making when I mentioned theoretical knowledge of statistics.


Thank you for this

1 Like

Re: Chronicle Of A Data Scientist/analyst by KlausMichaelson: 6:45am On Jul 29, 2020
ibromodzi:



Steps in determining which test to use in a statistical analysis:
1. What type of variables are you dealing with? Numerical or categorical?

2. What type of analysis are you doing?
a. Comparison (mean, median, proportion): this is where you use t-tests, ANOVA, etc., depending on the distribution of the data (parametric or not).
b. Relationship between two variables (say gender and smartness): this is where you use correlation (Pearson or Spearman rank).
c. Predicting one variable from another (say exposure to smoke predicting the risk of lung cancer): this is a regression task.

3. Number of groups involved (say the effect of chloroquine in COVID-19 and non-COVID-19 patients: 2 groups).

4. Distribution of your data (normal or not); there are different ways to determine this.

I believe you now understand the point I was making when I mentioned theoretical knowledge of statistics.

Yes my Oga. Thank you for this. I really appreciate Sir.

1 Like

Re: Chronicle Of A Data Scientist/analyst by BelieverDE: 8:32am On Jul 29, 2020
KlausMichaelson:


Wow, I haven't gotten to that level then. Thanks for the insight.
Do you mean you can do a computation in a new column that uses values from other columns with just a single input? And it will fill in the answer for every cell in the new column, with just one input or formula?
Yes
Re: Chronicle Of A Data Scientist/analyst by KlausMichaelson: 8:34am On Jul 29, 2020
BelieverDE:

Yes

Wow I never knew. Thanks
Re: Chronicle Of A Data Scientist/analyst by BelieverDE: 8:39am On Jul 29, 2020
ibromodzi:



Modified:
Steps in determining which test to use in a statistical analysis:
1. What type of variables are you dealing with? Numerical or categorical?

2. What type of analysis are you doing?
a. Comparison (mean, median, proportion): this is where you use t-tests, ANOVA, etc., depending on the distribution of the data (parametric or not).
b. Relationship between two variables (say gender and smartness): this is where you use correlation (Pearson or Spearman rank).
c. Predicting one variable from another (say exposure to smoke predicting the risk of lung cancer): this is a regression task.

3. Number of groups involved (say the effect of chloroquine in COVID-19 and non-COVID-19 patients: 2 groups).

4. Distribution of your data (normal or not); there are different ways to determine this.

I believe you now understand the point I was making when I mentioned theoretical knowledge of statistics.

Thanks

1 Like

Re: Chronicle Of A Data Scientist/analyst by HewlettPackard: 11:00am On Jul 29, 2020
wisemania:

Getfreetutorial.com
Thanks man.
Re: Chronicle Of A Data Scientist/analyst by Samzeal(m): 7:02pm On Jul 29, 2020
Re: Chronicle Of A Data Scientist/analyst by Dum20: 8:14pm On Jul 30, 2020
Na wa ooo.

Where is everyone?

It seems the ogas are enjoying Sallah break
Re: Chronicle Of A Data Scientist/analyst by HewlettPackard: 10:30pm On Jul 30, 2020
Regards2U:
LinkedIn Learning: Data Analyst

It's a free course from LinkedIn Learning for those who want to become a data analyst

Valid till march 2021

Link: https://www.linkedin.com/learning/paths/become-a-data-analyst?trk=li-data-become-en&src=re-other&veh=%7Cre-other

Nice. Any way of downloading the videos for offline use?

Modified: got it, the service is available on the app.
Re: Chronicle Of A Data Scientist/analyst by Ejiod(m): 7:19am On Jul 31, 2020
Hey guys... I just finished reading Medium this morning and felt like sharing. There's currently a new revolution by the tech giants regarding content learning. Please, guys, try visiting Grow with Google.

6 Likes 1 Share
