
My First Attempt at Teaching a Machine

Throughout the post, I emphasize the key lessons I learned: patience, persistence, and a willingness to embrace challenges. I also highlight the difference between teaching a machine to think and simply using algorithms, explaining how my project combined both approaches to build a working NER system.

Alim

Introduction

Imagine sitting in a dimly lit room, surrounded by the hum of computers, as a group of eager minds gathers for a hackathon. The air is thick with anticipation and excitement, each participant ready to tackle the challenge of teaching machines to think like humans. This was my reality as I embarked on my journey into artificial intelligence (AI), driven by a fascination with intelligent systems that learn and adapt autonomously.

As an aspiring AI engineer, I often ponder the profound question: What does it truly mean to teach a machine to think? This inquiry has propelled me into the depths of AI, where I have encountered both exhilarating breakthroughs and daunting challenges. In this blog post, I will share my first attempt at teaching a machine to think, the obstacles I faced, and the invaluable lessons I learned along the way.

My Journey into Teaching Machines to Think


My fascination with AI ignited during my master's program in Digital Finance. It was here that I first applied AI concepts to real-world problems, such as financial customer segmentation. The thrill of seeing a machine analyze data and derive insights was akin to watching a child take their first steps—both exhilarating and transformative. This experience solidified my desire to explore the potential of machines that learn and adapt.

The Hackathon: A Crucible of Learning

My enthusiasm led me to participate in the Problematique Algorithm Solution (PAS) hackathon, which became the perfect testing ground for my skills and a unique opportunity to apply my theoretical knowledge in a practical setting.

The Challenge: Named Entity Recognition in African Languages


My team and I faced the challenge of named entity recognition (NER) in African languages, a task that seemed daunting yet exciting.

What is Named Entity Recognition (NER)?


Named Entity Recognition (NER) is a crucial technique in natural language processing (NLP) that identifies and classifies key elements in text into predefined categories such as names of people, organizations, locations, dates, and more. Imagine scanning a document to find all mentions of a specific company or person—NER automates this process, making it easier to extract structured information from unstructured text. For example, in the sentence "Steve Jobs co-founded Apple in 1976," NER would identify "Steve Jobs" as a person, "Apple" as an organization, and "1976" as a date. This capability is essential for various applications, including search engines, chatbots, and social media monitoring, as it helps machines understand the context and relationships within the text.

This challenge pushed us to explore various models, including LSTM, BERT, RoBERTa, and XLM-RoBERTa.
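
To make the task concrete, here is a minimal sketch using the Hugging Face transformers pipeline. The checkpoint name is just an example of a public multilingual NER model, not the model we trained:

```python
from transformers import pipeline

# Illustrative only: a public multilingual NER checkpoint, not our
# hackathon model. aggregation_strategy="simple" merges word pieces
# back into whole entities.
ner = pipeline(
    "ner",
    model="Davlan/xlm-roberta-base-ner-hrl",
    aggregation_strategy="simple",
)

for entity in ner("Steve Jobs co-founded Apple in 1976."):
    print(entity["word"], "->", entity["entity_group"], f"({entity['score']:.2f})")
# Expected (roughly): "Steve Jobs" -> PER, "Apple" -> ORG, "1976" -> DATE
```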

How NER Contributes to AI Performance

NER plays a pivotal role in enhancing the overall performance of AI systems. Here’s how:

Information Extraction: NER transforms raw text into structured data, allowing AI systems to analyze and utilize information effectively. This matters because, by common industry estimates, roughly 80% of all data is unstructured.
Improving Contextual Understanding: By identifying entities and their relationships, NER helps AI systems grasp the meaning of sentences more accurately, which is vital for applications like chatbots and virtual assistants.
Facilitating Advanced Analytics: NER enables advanced analytics across various domains, allowing organizations to derive insights from large volumes of data efficiently.
Enhancing Search and Recommendation Systems: By accurately identifying entities, NER improves the relevance of search results and personalizes recommendations based on user interactions.
Supporting Sentiment Analysis: NER aids in sentiment analysis by identifying entities mentioned in customer reviews and social media posts, enabling businesses to gauge public opinion more accurately.
Streamlining Customer Support: NER helps categorize and prioritize customer inquiries, allowing support teams to respond more efficiently and enhance the overall customer experience.
Reducing Human Error: Automating the extraction and categorization of entities minimizes human error in data analysis, improving the accuracy of insights derived from the data.

Teamwork

We divided our tasks efficiently:

I focused on training the model and laying the groundwork for our application.
Teammate A managed communication and documentation.
Teammate B developed a user-friendly interface.

The mentorship we received was invaluable, providing practical insights that boosted our confidence.


Training the Model



Training the model was a fascinating process. I watched in awe as the machine learned, adjusting its parameters to minimize errors. However, we faced challenges like overfitting, where the model excelled on training data but faltered on test data. To combat this, we implemented techniques like dropout and regularization, gradually improving the model's performance. The sense of accomplishment as our accuracy increased was exhilarating.
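
For readers curious what those techniques look like in code, here is a minimal PyTorch sketch of dropout plus weight decay (L2 regularization). The layer sizes, label count, and hyperparameters are illustrative assumptions, not our exact hackathon configuration:

```python
import torch
import torch.nn as nn

# A minimal, illustrative token-classification head; sizes and rates
# are placeholders, not our exact setup.
class TokenClassifier(nn.Module):
    def __init__(self, hidden_size=768, num_labels=9, dropout_rate=0.3):
        super().__init__()
        # Dropout randomly zeroes activations during training,
        # discouraging the model from memorizing the training set.
        self.dropout = nn.Dropout(dropout_rate)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, encoder_output):
        # encoder_output: (batch, seq_len, hidden_size) from e.g. XLM-RoBERTa
        return self.classifier(self.dropout(encoder_output))

model = TokenClassifier()
# Weight decay in the optimizer acts as L2 regularization.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)
```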


After extensive training and fine-tuning, it was time to test our model on unseen data. With a mix of excitement and nervousness, I ran the test. To my delight, the model correctly tagged most of the named entities! That moment reaffirmed my belief in the importance of continuous testing and refinement when building strong models.
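
A common way to score an NER model on held-out data is entity-level precision, recall, and F1, for example with the seqeval library. The tag sequences below are made-up placeholders just to show the shape of the evaluation:

```python
from seqeval.metrics import classification_report

# Made-up gold and predicted tag sequences for two short sentences;
# seqeval scores at the entity level, not the token level.
y_true = [["B-PER", "I-PER", "O", "B-ORG", "O"], ["B-LOC", "O", "O"]]
y_pred = [["B-PER", "I-PER", "O", "B-ORG", "O"], ["O", "O", "O"]]

print(classification_report(y_true, y_pred))
```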

From Model to Application


Image: the NER app in action, tested on a Wolof example

After refining our model, we moved to the next phase:

Teammate B created an impressive endpoint and UI for processing PDFs and long texts.
We deployed the model to the cloud, exposing it through an ngrok tunnel on a GCP instance for accessibility (a rough sketch follows below).
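
As a sketch of what such a setup can look like (the route name, port, and model checkpoint are assumptions, not our exact deployment), a small Flask endpoint tunneled through ngrok might be:

```python
from flask import Flask, request, jsonify
from pyngrok import ngrok
from transformers import pipeline

app = Flask(__name__)

# Placeholder checkpoint: substitute your own fine-tuned NER model.
ner = pipeline("ner", model="Davlan/xlm-roberta-base-ner-hrl",
               aggregation_strategy="simple")

@app.route("/ner", methods=["POST"])
def extract_entities():
    text = request.get_json().get("text", "")
    entities = ner(text)
    # Cast numpy floats to plain floats so they serialize to JSON.
    return jsonify([{**e, "score": float(e["score"])} for e in entities])

if __name__ == "__main__":
    # pyngrok opens a public tunnel to the local port.
    print("Public URL:", ngrok.connect(5000).public_url)
    app.run(port=5000)
```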

Although we didn't win the hackathon, we were thrilled to place in the top two of the accompanying Kaggle challenge, competing against highly skilled ML engineers. The thrill of a Kaggle competition is unmatched: even as the deadline approaches, you never know whether your score will hold. It was an exhilarating experience filled with anticipation.


Key Lessons Learned

The Power of Collaboration: Our diverse team allowed us to leverage individual strengths and tackle challenges more effectively.
Embracing the Learning Curve: Teaching machines to think requires patience, experimentation, and a willingness to learn from mistakes.
Data is King: The quality and diversity of training data significantly impact model performance.

Teaching a Machine to Think vs. Using Algorithms


Through this experience, I gained a deeper understanding of the distinction between programming machines to follow algorithms and enabling them to think and solve problems independently.

Teaching Machines to Think

Conceptual Understanding: Develops systems that learn from data, draw conclusions, and make decisions.
Learning from Experience: Involves machine learning techniques to identify patterns and relationships in data.
Cognitive Flexibility: Enables application of knowledge to new and unfamiliar problems.

Using Algorithms

Defined Procedures: Provides clear, step-by-step instructions for specific tasks.
Static Solutions: Offers fixed solutions that don't adapt to new data or contexts.
Limited Adaptability: Excels at specific tasks but lacks the ability to learn or improve over time.

Bridging the Gap

In our project, we found that combining both approaches yielded powerful results: fixed algorithms handled data preparation and analysis, while the learning component allowed the model to improve continuously.
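
A toy contrast makes the difference tangible. A hand-written rule (the "algorithm") is fast and predictable but frozen, while the pipeline example shown earlier is the learned counterpart that generalizes from data. The regex below is deliberately simplistic:

```python
import re

# The "algorithm" approach: a fixed, hand-written rule for one entity
# type. It never improves and misses anything its author didn't foresee.
def find_years(text: str) -> list[str]:
    return re.findall(r"\b(?:1[89]\d{2}|20\d{2})\b", text)

print(find_years("Steve Jobs co-founded Apple in 1976."))  # ['1976']
print(find_years("Apple was founded in 'seventy-six."))    # [] - the rule is blind here

# The "learning" approach, by contrast, was never given a rule for
# dates, yet can tag them because it generalized from labeled examples.
```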

Conclusion

Understanding the distinction between teaching a machine to think and using algorithms is crucial for anyone in AI. Algorithms provide structure, but the ability to think and learn enables machines to adapt and solve complex problems that static programming cannot address. As I continue exploring AI, I am eager to see how these concepts can converge to create intelligent systems that truly learn and evolve. If this post has sparked your interest or if you have your own machine learning experiences to share, I’d love to hear from you!


