₦airaland Forum


Artificial Intelligence And Machine Learning Group - Programming (12) - Nairaland


Re: Artificial Intelligence And Machine Learning Group by Horus(m): 11:12am On Jun 09, 2018

https://www.youtube.com/watch?v=RzG8LQ4Ntuw

Data Science Nigeria Opens First Nigerian Artificial Intelligence HUB in UNILAG

Data Science Nigeria, in furtherance of its drive to boost research and innovation in Nigeria, opened the first-ever Nigerian Artificial Intelligence Hub at the University of Lagos.

3 Likes

Re: Artificial Intelligence And Machine Learning Group by Desyner: 1:15am On Jun 16, 2018
4kings:

If you're comparing two documents, like two news articles, then the importance of similarities between words is not really high, because tf-idf is based on words, and two news articles about "Israel and Trump" would share words that tf-idf can use to rank similarity.
For a better result, though, consider applying stemming and stop-word removal to reduce noise.

But I just realised you were talking about document titles. If that's the case, you can use word embeddings to vectorize words and capture meaning. However, word embeddings tend to group words by the context they appear in, not by a synonym/antonym structure, so "hot" and "cold" would end up with nearly the same vector.

Python has NLTK, TextBlob and spaCy, which are NLP packages that can easily be used to connect to WordNet and find word similarity, so I'm sure Java has similar packages. Better still, write one yourself that connects to WordNet.
This approach might not be effective, because it would require an HTTP request for every word, and you might not be able to afford to store all the words and synonyms somewhere.

So if you are finding similarities in general plain text, that's difficult. Word embeddings are the best choice I can think of right now, and you could get an already trained model online for your task, but I doubt it will perform well on general plain text. If your task is limited to a single domain, you can train the word embeddings yourself.

The best way to approach this is to analyse the document itself with tf-idf, not just the title, to get a good result.
I finally implemented the tf-idf thing, and it appears to be grossly inadequate for what I am looking to achieve.
My goal was to detect news carrying the same message from various news vendors or media houses. I have tried matching them by title, excerpt and full body, but no luck so far. The stop words in tf-idf are really a headache. As you suggested, lemmatization should help, but I strongly suspect it won't reduce the stop words to an acceptable level.

I am still looking for a solution to it, just in case.
I felt that since the WordPress similar-posts feature is so common, I should be able to get this done in another language, Java, but that's not the story so far.
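
For reference, the tf-idf matching discussed in this post can be sketched roughly as below. This is a minimal illustration in plain Python, not the Java implementation mentioned above: the stop-word list, the smoothed idf formula and the three sample headlines are all assumptions made up for the example.

```python
import math
import re
from collections import Counter

# Assumption: a tiny hand-picked stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "on", "for", "by", "is", "with"}

def tokenize(text):
    """Lowercase the text, keep alphabetic runs, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed idf (one common variant); rarer terms get larger weights.
        vectors.append({
            t: (count / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, count in tf.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical headlines, invented for this example.
docs = [
    "Psychic pig predicts Super Eagles semi final",
    "Olamide and Phyno release Super Eagles team song",
    "Pig predicts Super Eagles will reach the semi final",
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[2]), cosine(vecs[0], vecs[1]))
```

The first and third headlines share most of their terms, so their score comes out higher than the score between the first and second, which only share "super" and "eagles". On full article bodies the gap is often much less clear-cut.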
Re: Artificial Intelligence And Machine Learning Group by tollyboy5(m): 2:40pm On Jun 17, 2018
Please, does anybody here have embedded programming knowledge?
Re: Artificial Intelligence And Machine Learning Group by raymod170(m): 9:06pm On Jun 17, 2018
Kindly state your problem so everyone can make an input.
Re: Artificial Intelligence And Machine Learning Group by Desyner: 7:39pm On Jun 22, 2018
OK. I made some progress. I was able to obtain the body of the news. This means I now have the title, the excerpt and the body text in full.

So far tf-idf is giving me unpleasant matches. For example, the news about the psychic pig predicting the Super Eagles' semi-final adventure lands in the same cluster as other Super Eagles news, such as the story about the "Super Eagles team song by Olamide and Phyno" and the other one by Simi et al.

My next target is to reduce stop words, but I am not using a list of words directly. I am using a POS tagger (part-of-speech tagger) to eliminate the conjunctions, modifiers and other categories that stop words generally fall into. With this approach, words like for, from, a, to, with, am, when and soon get eliminated because of the category the POS tagger places them in, not because I specified those words directly.
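
The tag-based filtering described above can be sketched roughly as follows. This is a minimal illustration that assumes the tagger emits Penn Treebank tags; the set of tags to drop and the sample tagged sentence are hypothetical, and the tagger itself is left out so the snippet stays self-contained.

```python
# Assumption: Penn Treebank tags, as emitted by many English POS taggers.
# Which tags count as "function words" is an illustrative choice, not a standard.
FUNCTION_TAGS = {
    "IN",    # prepositions / subordinating conjunctions: for, from, with
    "DT",    # determiners: a, the
    "CC",    # coordinating conjunctions: and, but
    "TO",    # "to"
    "MD",    # modals: will, can
    "PRP",   # personal pronouns
    "PRP$",  # possessive pronouns
    "WRB",   # wh-adverbs: when, where
    "RB",    # adverbs: soon (note: this also drops "not")
}

def drop_function_words(tagged_tokens, drop_tags=FUNCTION_TAGS):
    """Keep only the words whose tag is not in drop_tags."""
    return [word for word, tag in tagged_tokens if tag not in drop_tags]

# Hypothetical tagger output for one sentence.
tagged = [
    ("Pig", "NNP"), ("predicts", "VBZ"), ("a", "DT"), ("win", "NN"),
    ("for", "IN"), ("the", "DT"), ("Super", "NNP"), ("Eagles", "NNPS"),
    ("soon", "RB"),
]
kept = drop_function_words(tagged)
print(kept)
```

One caveat: in the Penn Treebank tag set "am" is tagged as a verb (VBP), so removing it by tag alone would also remove ordinary present-tense verbs. A small explicit list for auxiliaries may still be needed alongside the tag filter.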


Something tells me you know what I need, but I haven't been able to communicate it to you in the right words yet. As I google more, I am learning new things. I now know what lemmatization, tokenization and clustering are.


Modified:
What I am looking for is clustering of news covering the same issue, event or happening. If I am given 10 articles from a number of news outlets (e.g. Vanguard, Punch, Tribune and Cable) and five of them report Buhari's exposé of budget padding by the Senate and HOR, then I need to be able to cluster those five together and nothing more. I hope this explains it to you.
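
One rough way to picture "cluster those five together and nothing more" is to link any two articles whose similarity passes a threshold and take the connected groups. The sketch below is an assumption-laden illustration, not a tested solution: it uses Jaccard overlap of word sets instead of tf-idf, the 0.5 threshold is arbitrary, and the headlines are invented.

```python
import re

def tokens(text):
    """Set of lowercase alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Shared words divided by total distinct words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(docs, threshold=0.5):
    """Group document indices; two documents are linked when their
    Jaccard similarity is at least `threshold` (single-link grouping)."""
    sets = [tokens(d) for d in docs]
    parent = list(range(len(docs)))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if jaccard(sets[i], sets[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical headlines, invented for this example.
headlines = [
    "Buhari exposes budget padding by Senate",
    "Senate budget padding exposed by Buhari",
    "Olamide releases Super Eagles team song",
    "Psychic pig predicts Super Eagles semi final",
]
clusters = cluster(headlines)
print(clusters)
```

With these headlines the two budget stories end up together, while the two Super Eagles stories stay apart because they share only two words. Single-link grouping can chain loosely related stories together on real data, so the threshold and the similarity measure would both need tuning.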
Re: Artificial Intelligence And Machine Learning Group by 4kings: 10:49am On Jun 23, 2018
Hmm, interesting.
You've not used word embeddings yet? That tends to improve performance.
You could also perform some semantic analysis tricks on the subjects of the articles.

I'm not at my system at the moment; I'll work on this and get back to you when I get home.
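
As a rough picture of the word-embedding suggestion, one common baseline is to average the vectors of the words in each text and compare the averages with cosine similarity. The sketch below only shows the mechanics: the 3-dimensional vectors are made up for illustration, whereas real embeddings would come from a trained model and have hundreds of dimensions.

```python
import math

# Made-up toy vectors, for illustration only; not from any trained model.
TOY_VECTORS = {
    "pig":       [0.9, 0.1, 0.0],
    "predicts":  [0.7, 0.2, 0.1],
    "forecasts": [0.7, 0.3, 0.1],
    "song":      [0.0, 0.1, 0.9],
    "eagles":    [0.3, 0.9, 0.2],
}

def doc_vector(words, vectors=TOY_VECTORS):
    """Average the vectors of the known words; None if no word is known."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[k] for v in known) / len(known) for k in range(dim)]

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

a = doc_vector(["pig", "predicts", "eagles"])
b = doc_vector(["pig", "forecasts", "eagles"])  # different wording, same story
c = doc_vector(["eagles", "song"])              # same team, different story
print(cosine(a, b), cosine(a, c))
```

Unlike plain tf-idf, this gives "predicts" and "forecasts" credit for being close in meaning even though the strings differ, which is the gain being suggested here.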
Re: Artificial Intelligence And Machine Learning Group by 4kings: 10:51am On Jun 23, 2018
Good development!!!

2 Likes

Re: Artificial Intelligence And Machine Learning Group by lum1: 8:24am On Jul 26, 2018
Sorry I have been offline, guys. I have been writing my PhD thesis (still writing).
Re: Artificial Intelligence And Machine Learning Group by 4kings: 5:18am On Jan 17
Sad... This thread was promising.
Life happens. oops.
What's up with SoftEng?
Re: Artificial Intelligence And Machine Learning Group by SoftEng: 12:30pm On Mar 01
4kings and all
I apologise for my absence (yes, life happens :)).
I have been mostly busy with different things.

Meanwhile, there are communities that have been very active in the AI space.
Data Science Nigeria and AI Saturday Lagos have been doing really well. AI Saturday Abuja also has some activities.
Also, an AI event called IndabaXNigeria will be happening at the University of Lagos in May.

I'm not sure whether I can be as active on this thread as I used to be. However, this should NOT defeat the purpose of the thread. At the bare minimum, if anyone needs help or has a question, I can assist.

Thanks 4kings, you've been very helpful and you have been a good friend on this thread.

1 Like

Re: Artificial Intelligence And Machine Learning Group by odizeey(m): 4:46pm On Mar 11
nice one bro.
Re: Artificial Intelligence And Machine Learning Group by Avast(m): 8:53am On Mar 23
I have been on a learning path since November last year. I started by learning Python basics and moved on to Python libraries. I started ML a month ago but don't really seem to understand it. Actually, I do understand the theoretical aspect of it. I am self-learning, and as a beginner I need a place in Lagos where I can regularly meet people on the same path, so I can learn in person and get more practical experience. Self-learning is not really working for me. Thanks.

SoftEng
4kings
Darivie04
Re: Artificial Intelligence And Machine Learning Group by odizeey(m): 9:49am On Mar 23
Search for AI6 Lagos.
Re: Artificial Intelligence And Machine Learning Group by Pdsco: 2:50pm On Jun 16
Thanks so much for this.
Re: Artificial Intelligence And Machine Learning Group by odizeey(m): 6:20pm On Jun 16
You're welcome. Did you join the last cohort?
Re: Artificial Intelligence And Machine Learning Group by Pdsco: 6:29pm On Jun 16
I tried joining, but there was an error on the page.
Re: Artificial Intelligence And Machine Learning Group by odizeey(m): 6:38pm On Jun 16
OK. It just ended; another will be coming up.
Re: Artificial Intelligence And Machine Learning Group by Darivie04(m): 4:38pm On Jun 17
Learning on one's own can be very discouraging since you have few people to talk and share ideas with. There are some online communities but they don't compare to physical communication with other people.

I don't live in Lagos so I can't suggest places in Lagos but I'm sure some others can.
Re: Artificial Intelligence And Machine Learning Group by Avast(m): 4:42pm On Jun 17
It is not easy
Re: Artificial Intelligence And Machine Learning Group by Raymagnate(m): 5:23pm On Jul 11
Dive into the top 10 artificial intelligence startups in Africa. Some of these tech companies can put money in your pocket by providing services to people around you; others help farmers improve crop yields and detect diseases early. There's a lot to learn from these AI startups.

Watch below...


https://www.youtube.com/watch?v=VHLdsL2xE9w
