₦airaland Forum


Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT - Programming - Nairaland


Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by OneOnland(op): 1:33pm On Feb 04, 2025
After creating 2 million GPT tokens, this UNILAG student has built an AI text-to-speech model with a Nigerian accent

Azeez Saheed YarnGPT founder

However, in my latest conversation with him, his new passion project seems to have pushed him further. He calls it YarnGPT, a text-to-speech AI model that can read text aloud in a Nigerian accent.

In a world where AI can generate lifelike voices in seconds, a text-to-speech model with a Nigerian accent might not seem groundbreaking at first. But when you consider two things, it becomes a big deal.

First, Azeez is a Nigerian university student with limited resources. Second, developing a model that accurately captures the nuances of a Nigerian accent is technically challenging.

From tokenising audio to the many mathematical concepts Azeez referenced while explaining the process, it was clear that this wasn’t a simple task. Even Azeez, in his usual fashion, didn’t downplay the effort involved.

"It was quite tasking, especially gathering the data needed to make this happen."

How YarnGPT was created
Inspired by Naijaweb, Azeez was eager to build something new. "The amount of conversations and interest people had in Naijaweb was a great motivation. Imagine getting featured on Techpoint Africa; it motivated me to do this."

He was also motivated by failure. Before starting YarnGPT, he had applied for a job at a Nigerian AI company but didn’t perform as well in the interview as he had expected.

YarnGPT became the project that would help him improve his skills and increase his chances of securing such roles in the future.

Building an AI model that sounds Nigerian required gathering a vast amount of Nigerian voices.

"I used some movies that were available online. I extracted their audio and subtitles."

Nollywood produces over 2,500 movies a year, and with many filmmakers uploading their work to YouTube, it seemed like Azeez had plenty of data to work with. But in reality, he had almost none.

"The problem with building in Nigeria is data. Replicating what has been built overseas isn’t that hard, but data always gets in the way."

While there were thousands of movies to choose from, the audio wasn't up to the standard he wanted, and the subtitles were inaccurate. To compensate, Azeez turned to Hugging Face, an open-source platform for machine learning and data science. He combined the audio from Nigerian movies with high-quality datasets from Hugging Face to train his model.

The next step was training the AI model, but without access to his own GPU, he had to rely on cloud computing services like Google Colab. This cost him $50 (₦80,000) — a significant amount for a university student. Unfortunately, it was a waste.

"The model I built wasn’t working well, and the $50 cloud credit was burnt just like that. It was painful for me."

Determined to find another way, he discovered Oute AI, a platform that had developed a text-to-speech model in an autoregressive manner.

"The way the model works is, you give it a piece of text, and it predicts one word at a time. It takes that word, adds it back to the text, then predicts the next one — kind of like how ChatGPT completes sentences. That’s what makes it autoregressive."

While I found the autoregressive framework difficult to understand, Azeez pointed out that it simply gave him better results.
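
For readers who share that difficulty, the loop Azeez describes can be sketched in a few lines of Python. This is only an illustration: the lookup table below is made up and stands in for a trained model, and a real model predicts tokens (pieces of words or audio) rather than whole words.

```python
# Toy next-token table standing in for a trained model (hypothetical data).
NEXT_TOKEN = {
    "<start>": "how",
    "how": "far",
    "far": "my",
    "my": "guy",
    "guy": "<end>",
}


def predict_next(context):
    """Return the next token given everything generated so far.

    This toy version only looks at the last token; a real model
    would use the whole context.
    """
    return NEXT_TOKEN.get(context[-1], "<end>")


def generate(max_tokens=10):
    """Predict one token, append it to the context, and repeat."""
    context = ["<start>"]
    for _ in range(max_tokens):
        token = predict_next(context)
        if token == "<end>":
            break
        context.append(token)  # feed the prediction back in
    return context[1:]  # drop the <start> marker


print(" ".join(generate()))
```

The key line is the append: each prediction becomes part of the input for the next one, which is what "autoregressive" refers to.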

Maths, tokenisation, and the hard part of YarnGPT
Oute AI provided a structure, but Azeez still had to build his own model. He took a language model called SmolLM2-360M from Hugging Face and added speech functionality to it, a process that involved major algorithmic changes.

After this, the final-year Mechanical Engineering student at the University of Lagos had to spend another $50 to train the model. The training took three days.

Interestingly, as he pointed out when he created Naijaweb, AI models need data to be tokenised. Large language models (LLMs) work with numbers, not words, so tokenisation converts words into numerical representations.

"If we were to tokenise the word CALCULATED, for example, we could split it into four tokens: CAL-CU-LA-TED. A number is assigned to each token."
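
A small Python sketch of that idea, assuming a hand-written vocabulary: the four pieces come from Azeez's example, but the ID numbers and the greedy longest-match rule are invented here for illustration. Real tokenisers learn their splits from data and would likely cut the word differently.

```python
# Hypothetical vocabulary: the IDs are invented for illustration.
VOCAB = {"CAL": 0, "CU": 1, "LA": 2, "TED": 3}


def tokenise(word, vocab):
    """Split `word` into known pieces, always taking the longest match."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shorter ones.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i:]!r}")
    return pieces


pieces = tokenise("CALCULATED", VOCAB)
ids = [VOCAB[p] for p in pieces]  # the numbers the model actually sees
print(pieces)
print(ids)
```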

Meanwhile, tokenising audio is different.

"Tokenizing audio is basically breaking down continuous sound waves into smaller, manageable pieces that a model can understand and process. Unlike text, which has clear breaks between words, audio is continuous—there are no natural pauses in a raw waveform.

"So, the model needs to convert the sound into a sequence of discrete values, kind of like turning a long speech into tiny puzzle pieces. These smaller audio tokens can then be used to train the AI, and later, the model can reassemble them to generate speech that sounds natural."

This entire process was made possible by a wave tokenizer. Using resources from Hugging Face, Oute AI, and other Nigerian repositories, Azeez was able to create YarnGPT.
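
To make "a sequence of discrete values" concrete, here is a deliberately crude Python sketch. It is not how a wave tokenizer works internally (those use learned neural codecs); it simply rounds each audio sample to one of 16 levels, which is an arbitrary choice, and then maps the tokens back to approximate samples.

```python
import math

LEVELS = 16  # size of the toy "audio vocabulary" (an arbitrary choice)


def encode(samples, levels=LEVELS):
    """Map each sample in [-1.0, 1.0] to an integer token in [0, levels - 1]."""
    tokens = []
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clip to the valid range
        tokens.append(round((s + 1.0) / 2.0 * (levels - 1)))
    return tokens


def decode(tokens, levels=LEVELS):
    """Turn tokens back into approximate sample values."""
    return [t / (levels - 1) * 2.0 - 1.0 for t in tokens]


# A short 440 Hz sine wave sampled at 8 kHz stands in for real speech.
wave = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(16)]
tokens = encode(wave)
rebuilt = decode(tokens)
print(tokens)
```

The rebuilt samples are close to the originals but not identical; that small loss is the price of turning a continuous wave into a fixed set of puzzle pieces.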

Publicising YarnGPT
Saheed Azeez: He built Naijaweb which is 230 million GPT2 tokens based on nairaland
Azeez might be a nerd, but he isn’t afraid to put himself in front of a camera to showcase his work. In a two-minute video, he explained YarnGPT and caught the attention of 138,000 people on X (formerly Twitter), including Timi Ajiboye, co-founder of Helicarrier (formerly BuyCoins).

Creating YarnGPT was difficult, but making the video was another hurdle.

"I called my friend and logistics manager, Aremu, and told him I wanted to make a video. We reached out to another friend who had a camera he wasn’t even using, and then we went to yet another friend’s house to record.

"We rearranged the whole house and used their TV as the background. His mum wasn’t too pleased when she returned."

The results were worth it. The video got thousands of views across social media, and people began testing YarnGPT. The model could not only pronounce English in a Nigerian accent but could also read Nigerian languages—Hausa, Igbo, and Yoruba.

It has various applications. Content creators can use it for voice-overs in Nigerian accents, Google Maps could provide directions in Nigerian languages, and it could even enhance accessibility for non-English speakers.

Nigeria and the AI race
While innovators like Azeez and American-born Ijemma Onwuzulike (creator of Igbo Speech) are developing exciting AI models, Nigeria remains far behind in the AI race. The industry has evolved beyond a hobbyist’s playground into a battleground for global superpowers, with the U.S. government committing $500 billion to AI development.

Meanwhile, AI breakthroughs like DeepSeek have shaken up Wall Street, causing giants like Nvidia to lose billions in market value due to new competition.

Even Azeez acknowledges Nigeria’s position.

"Honestly, we’re way off. We’re not even in the race. The big AI models today — like OpenAI’s or the ones from China — are trained on massive datasets with huge computational resources, things we don’t have here."

But he remains optimistic.

"I think there’s a way forward. Instead of trying to build from scratch, we can focus on localising AI for our own needs. We can take what’s already been built and adapt it for Nigerian languages and accents. That’s how we can start catching up."

Nigeria’s Minister of Communications and Digital Economy, Bosun Tijani, has been vocal about positioning the country as a key player in AI development. Perhaps, with talents like Azeez, there is hope.
Source:
https://techpoint.africa/2025/02/04/how-unilag-student-created-yarngpt-ai/

Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by EmperorIsaac(m): 1:49pm On Feb 04, 2025
Good!
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by MindHacker9009(m):
Good. Next University Electrical Engineers and their professors should create cheap 24 hours electricity from local materials.

Alphabyte3:
While in school one lecturer said we should embark on hydroelectricity project to create it using the river at the back of the university to turn a turbine. He didn't even do a solid estimate. The cost was about N5m the next lecture he stopped because of funding.
A turbine is crafted from locally sourced raw materials. Edo State has a traditional iron smelting factory and copper craftsmanship too, so your professor should have visited these areas to give them his design specifications for his turbine design which will not cost much and with free student labour cheap electricity could be made available.
This approach is used in developed countries, where local traditional factories are used for such projects.


https://www.youtube.com/watch?v=TFsoXPRQnbg
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by Alphabyte3: 6:02pm On Feb 04, 2025
MindHacker9009:
Good. Next University Electrical Engineers and their professors should create cheap 24 hours electricity from local materials.
While in school one lecturer said we should embark on hydroelectricity project to create it using the river at the back of the university to turn a turbine. He didn't even do a solid estimate. The cost was about N5m the next lecture he stopped because of funding.
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by OneOnland(op): 9:02am On Feb 05, 2025
Nlfpmod, shouldn't we recognize our Yarngpt? It's open-source.

https://github.com/saheedniyi02/yarngpt
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by GOFRONT(m): 10:41am On Feb 05, 2025
cool

Nice one......The sweetest side of a Male child.


Chaii.......Omoluabi.

Yoruba Amaka!!!

Our girls are on Tiktok, facebook reels and onlyfans disgracing us with their endless display of panties and towto related videos. Thats all their brain can process....... Saidaboj n co.......Tufiakwa

Besides, if you look behind Azeez in the pics, you would also see those two boys......You can see them tryna give a handshake like when Martinelli gave a handshake to Kai Havertz in celebration of their goals when they pummelled Mancity last weekend. Hahahaha....

With that Handshake in the picture, I feel those guys have something cooking already in terms of Innovation and technology. Go ahead guyz.......The sky is y'all Limit
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by KingAzubuike(f): 10:41am On Feb 05, 2025
Watch how his greedy lecturers and project supervisors hijack the project from him and turn it to their own.
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by yesloaded: 10:42am On Feb 05, 2025
Impressive
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by UHLmoving: 10:42am On Feb 05, 2025
Ok
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by Quebec91(m): 10:42am On Feb 05, 2025
cheesy
Alphabyte3:
While in school one lecturer said we should embark on hydroelectricity project to create it using the river at the back of the university to turn a turbine. He didn't even do a solid estimate. The cost was about N5m the next lecture he stopped because of funding.
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by Tobi2025: 10:43am On Feb 05, 2025
Nice one but where can we download it
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by D00msDay(m): 10:44am On Feb 05, 2025
Tech is the future. If u like still go school go dey study archaic courses, na labour market go teach u sense.
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by franchasofficia: 10:45am On Feb 05, 2025
Amazing.


First time a Yoruba Moslem will be doing something positive and not like that terrorist called MURIC and the other kingpin hiding at Aso Rock shocked
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by ictjobber: 10:45am On Feb 05, 2025
Okay
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by Seadogfellow: 10:45am On Feb 05, 2025
cheesy

Original Yoruba boy

Unlike other trading in hard drugs and skull mining. While other half baked seimi illiterates like yarimo freestuffsng and co are flooding nairaland defending their vegetable, drug wicken president
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by Wealthoptulent(m): 10:45am On Feb 05, 2025
grin
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by festacman(m):
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by Kingpele(m): 10:47am On Feb 05, 2025
Good
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by Dougad: 10:47am On Feb 05, 2025
All these plenty stories without a link to the gpt?

And I don't want to see a Techpoint link again. These fools put a modal on screen that you can't exit without registering. That's pure nonsense.
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by benuejosh: 10:48am On Feb 05, 2025
Making Nigeria proud.

The real developer!
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by Meerahbel: 10:48am On Feb 05, 2025
grin
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by OneOnland(op): 10:49am On Feb 05, 2025
Tobi2025:
Nice one but where can we download it
Dougad:
All these plenty stories without a link to the gpt?
It's not hosted, I think. You're going to have to install it to your own machine or VM like how people get models on hugging face.

The link:

https://github.com/saheedniyi02/yarngpt
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by Meerahbel: 10:49am On Feb 05, 2025
grin
MindHacker9009:
Good. Next University Electrical Engineers and their professors should create cheap 24 hours electricity from local materials.
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by Meerahbel: 10:50am On Feb 05, 2025
Dougad:
All these plenty stories without a link to the gpt?
I just tire.
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by TrackerSK: 10:50am On Feb 05, 2025
Nice one brother,why others are still playing sporty bet
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by Ajsmart(m): 10:50am On Feb 05, 2025
Commendable achievement
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by EmmyMaestro(m): 10:50am On Feb 05, 2025
Meanwhile a pastor that owns 5 cars is asking church members to donate money to buy a car for his wife's upcoming birthday
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by Starboytwo(m): 10:51am On Feb 05, 2025
Na small small he dey start.
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by Dougad: 10:51am On Feb 05, 2025
OneOnland:
It's not hosted, I think. You're going to have to install it to your own machine or VM like how people get models on hugging face.

The link:

https://github.com/saheedniyi02/yarngpt
I saw your post after I made my comment. Will fork it later.
Re: Saheed Azeez Created 2 Million GPT Tokens, Built An AI, YarnGPT by FreeStuffsNG: 10:51am On Feb 05, 2025
OneOnland:
After creating 2 million GPT tokens, this UNILAG student has built an AI text-to-speech model with Nigerian accent



Source:
https://techpoint.africa/2025/02/04/how-unilag-student-created-yarngpt-ai/
Awesome!

That's Faculty of Science, University of Lagos (UNILAG), at the background of his picture.