
Machine learning introduction through practice


How it started

A few weeks ago, I decided to learn a bit more about machine learning and figured the best way to do so was to write and debug some code plugged into a machine learning model. A neat bonus would be to figure out a way for this to run on a tiny device without using an external model API, so that the cost is limited to that of my electricity bill.

Now, I’ve thrown around a few terms that I have never written about before, so it is worth sharing some definitions before going any further.

First, what is machine learning? It is an artificial intelligence-focused branch of computer science that aims to make machines able to perform tasks more and more autonomously through exposure to data, a process called training. A machine learning model is the output of such training and can be stored anywhere bytes can go, like a local file or the cloud.

You have likely heard of models like Claude or DeepSeek, but there are many more models out there trained to perform a variety of tasks, from movie recommendation to API threat detection. In today’s post, I want to keep things simple and will use a small model that, given a phrase, tells me whether it is negative or positive. This is called sentiment detection.

Picking the right API/machine learning model pair

Since I had little clue about what I was doing here, I decided to research my way through it using Grok 3, an AI agent developed by xAI. It is similar to the more popular ChatGPT, but it felt more accurate back in the early days when I was looking for an AI agent to use for recipes and so on. I have stuck with it since.

The first step was to formulate my requirements clearly so that I would get the best result possible. AI agents are not magical; just like with developers, if you feed them garbage input, you will likely get garbage output.

I knew that I wanted a sentiment detection model small enough to fit and run smoothly on a Pinephone. Since the only time I successfully built an app running on a Pinephone was with .NET Core and C#, I needed a model that I could easily integrate with those.

My first step was to look for a C# framework that would let me integrate with a machine learning model. It didn’t take much searching to find Microsoft’s ML.NET, which lets you both train your own models and import existing ones.

My first question to Grok was whether I could load any model in ML.NET. The answer was yes, so long as the model was either trained using ML.NET or in the ONNX format.
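To give you an idea of what that integration looks like, here is a minimal sketch of loading an ONNX model with ML.NET’s OnnxTransformer (from the Microsoft.ML and Microsoft.ML.OnnxTransformer packages). The column and file names here are hypothetical; they have to match the actual model’s input and output node names.

using Microsoft.ML;

var mlContext = new MLContext();

// "input_ids", "attention_mask" and "logits" are placeholder node names;
// inspect the ONNX file (with a tool like Netron) to find the real ones.
var pipeline = mlContext.Transforms.ApplyOnnxModel(
    outputColumnNames: new[] { "logits" },
    inputColumnNames: new[] { "input_ids", "attention_mask" },
    modelFile: "mobilebert.onnx");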

From there, I sent the first of many prompts to understand which models could both run smoothly on a standard Pinephone and provide sentiment detection.

My Pinephone’s CPU is an Allwinner A64 (ARM @ 1.2 GHz) with 3 GB of RAM and 32 GB of storage, expanded by a 1 TB microSD card. The GPU is a Mali-400 MP2, which is not optimised for heavy computations.

These requirements mean that I need a lightweight model with few parameters so that my tiny Pinephone CPU can handle it and give me a result within a few seconds. In the machine learning context, a parameter is an internal variable created during training that influences how the model predicts outcomes.

In our case, we want to predict whether a given sentence is negative or positive by using a model that would be trained on both positive and negative sentences.

After about an hour alternating between googling and grokking, I eventually landed on Google’s MobileBERT as my model of choice.

One of the key advantages is how small this text-classification model is. Its PyTorch version weighs just under 150 MB at the time of writing, which fits comfortably within my 3 GB of RAM. It also “only” uses 25 million parameters, versus the ~1.8 trillion (~72,000 times more) reportedly used by mainstream models like ChatGPT 4.5.

However, as you will come to find later, I made a crucial mistake in going for that model without further ado.

Now we create our Avalonia app

I am switching format here so that you can follow along if you wish; this is going to get a bit more tutorial-like. We will create a basic Avalonia app similar to my DummyCounter. It will be a simple one: all we need is a text input, a text block to show the result of our sentiment detection, and a button to trigger said detection.

The following assumes that you know how to use a terminal and have .NET 9 installed. .NET 9 and the Avalonia framework are cross-platform, so you should be able to replicate these steps regardless of what machine you’re using.
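If you are unsure which .NET version you have, you can check with:

dotnet --version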

First, we will create a new .NET solution within a directory named mobile-sentiment-detector with the command below:

dotnet new sln -n MobileSentimentDetector -o mobile-sentiment-detector

Next, we will install Avalonia templates so that we can easily create an Avalonia project with the boilerplate for a new app:

dotnet new install Avalonia.Templates

Now that our templates are installed, we can create a new Avalonia app project and link it to our solution.

dotnet new avalonia.app -o MobileSentimentDetector
dotnet sln add .\MobileSentimentDetector\

From here, we can validate that our boilerplate setup works by running our project:

dotnet run --project .\MobileSentimentDetector\
[Screenshot: Avalonia setup working on .NET 9.0, our boilerplate app works]

Now, I will update this to match what I want to build. I thought about vibe coding my way to the desired interface, but then remembered that I could just go back to my DummyCounter code and tweak it from there using the available Avalonia docs.

After leaving this alone for a whole week, I used a Sunday afternoon to write the basic UI that takes an input and updates the sentiment placeholder accordingly. Here is the code I used to replace the contents of the original MainWindow.axaml file:

<Window xmlns="https://github.com/avaloniaui"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    mc:Ignorable="d" d:DesignWidth="400" d:DesignHeight="800"
    xmlns:vm="using:MobileSentimentDetector.ViewModels"
    x:Class="MobileSentimentDetector.MainWindow"
    Title="MobileSentimentDetector"
    Background="#2B2E5F">
    <Design.DataContext>
        <vm:MainWindowViewModel />
    </Design.DataContext>

    <DockPanel LastChildFill="True">
        <DockPanel LastChildFill="True" Width="NaN" DockPanel.Dock="Top">
            <TextBlock HorizontalAlignment="Center" VerticalAlignment="Top" FontSize="26" Foreground="White"
                Margin="20">Mobile Sentiment Detector</TextBlock>
        </DockPanel>
        <DockPanel LastChildFill="True" Width="NaN" DockPanel.Dock="Bottom">
            <Button Height="44" Margin="20" Background="White" FontSize="20"
                HorizontalAlignment="Center" VerticalAlignment="Bottom" Click="DetectSentiment">Detect Sentiment</Button>
        </DockPanel>
        <StackPanel>
            <TextBlock x:Name="sentimentPlaceholder" Height="44" Margin="20" Foreground="White" FontSize="20"
                HorizontalAlignment="Center" VerticalAlignment="Center"  >Sentiment placeholder</TextBlock>
            <TextBox  x:Name="input" Margin="20" Height="200" AcceptsReturn="False" TextWrapping="Wrap"
                FontSize="18"
                Background="White" />
        </StackPanel>
    </DockPanel>
</Window>
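For the XAML above to compile, the button’s Click attribute needs a matching handler in MainWindow.axaml.cs. Here is a minimal code-behind sketch for this stage, before any model is wired in; the placeholder logic is just mine:

using Avalonia.Controls;
using Avalonia.Interactivity;

namespace MobileSentimentDetector;

public partial class MainWindow : Window
{
    public MainWindow()
    {
        InitializeComponent();
    }

    // Wired to the button's Click attribute in MainWindow.axaml.
    // Controls declared with x:Name are exposed as fields by Avalonia.
    private void DetectSentiment(object? sender, RoutedEventArgs e)
    {
        sentimentPlaceholder.Text = string.IsNullOrWhiteSpace(input.Text)
            ? "Type a phrase first"
            : "Sentiment placeholder";
    }
}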

And now the result after re-running my app.

Now that I have my UI set up, I can start integrating with the machine learning model.

BERT the Machine (learning model)

As mentioned previously, I picked Google’s MobileBERT machine learning model to build my sentiment detection app.

Before pulling the model, I asked Grok whether there was a way to make it even smaller for better performance. That is how I heard about quantisation, which prompted me to find out more.

After some googling, I learned that quantisation is a process that reduces the precision of a model’s weights and activations from a high-precision format like 32-bit floating point (FP32) down to lower-precision formats like 8-bit integers (INT8).
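To make that concrete, here is a toy sketch of affine quantisation, mapping a handful of FP32 values to INT8 through a scale and zero-point. Real quantisation pipelines are more involved, but the arithmetic is the same idea:

using System;
using System.Linq;

float[] weights = { -0.51f, 0.02f, 0.37f, 1.24f };

// Map the observed value range onto the 256 representable INT8 levels.
float min = weights.Min(), max = weights.Max();
float scale = (max - min) / 255f;
int zeroPoint = (int)Math.Round(-min / scale) - 128;

sbyte Quantise(float v) =>
    (sbyte)Math.Clamp((int)Math.Round(v / scale) + zeroPoint, -128, 127);

float Dequantise(sbyte q) => (q - zeroPoint) * scale;

// Each value survives the round trip with only a small precision loss.
foreach (var w in weights)
    Console.WriteLine($"{w} -> {Quantise(w)} -> {Dequantise(Quantise(w)):F3}");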

Weights are parameters that control how inputs are combined. As for activations, they determine the output of a neuron after processing the weighted inputs. I did not look further into these, but I will at a later time to deepen my understanding of machine learning.

Back to quantisation: there are different types, but the first one I read about felt appropriate. Post-Training Quantisation (PTQ) uses calibration data to minimise accuracy loss and is common for ONNX models.

This would require at least a subset of the sentiment detection data that I am interested in. However, this feels like a treat I should reserve for a future blog post. Beyond that, if I mess up the PTQ for my model, it would be a nightmare to debug with my currently limited knowledge.

Tokens

You have probably heard of tokens in the machine learning context: “this model can process X tokens/second with a bazillion GPUs” and so on. Tokens are the basic units of text that models take as input.

MobileBERT is a transformer model that uses Natural Language Processing (NLP) to understand my sentences by tokenising them. It comes with a vocabulary file, which I assume allows converting tokens into numbers through indexes and vice versa. If you open a vocabulary file, you will see various tokens like below:
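Here is an illustrative excerpt, one token per line; this is not the exact MobileBERT file, but in standard uncased BERT vocabularies the special tokens come first ([CLS] sits at index 101 and [SEP] at 102), followed by thousands of words and word pieces:

[PAD]
[UNK]
[CLS]
[SEP]
[MASK]
the
peasant
##ively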

Looking at the above, if you wrote the phrase “peasantively”, you would have 4 tokens:

[CLS], peasant, ##ively, [SEP]

[CLS] is the classification token; it always appears at the start of my tokenised inputs, so it must be a way to tell the model that whatever comes after it must be sorted into a class/category. [SEP] is a separator token; it marks the end of the sequence, and in models that take sentence pairs it also sits between the two sentences.

These would be encoded into their vocabulary indexes before being fed into the model:

101, 14539, 14547, 102

In the context of this post, I would need to break my sentence into tokens that I can then feed to the MobileBERT model so that it can predict whether the sentiment of the phrase is negative or positive.
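The standard way to break a word down against a vocabulary is a greedy, longest-match-first algorithm called WordPiece. Here is a toy sketch of that matching loop; the tiny vocabulary is a stand-in, and real BERT tokenisation also lowercases text and splits punctuation first:

using System;
using System.Collections.Generic;

var vocab = new HashSet<string> { "peasant", "##ively", "##ly", "pea" };

List<string> WordPiece(string word)
{
    var pieces = new List<string>();
    int start = 0;
    while (start < word.Length)
    {
        // Try the longest remaining substring first, then shrink it.
        int end = word.Length;
        string? match = null;
        while (start < end)
        {
            var candidate = word.Substring(start, end - start);
            if (start > 0) candidate = "##" + candidate; // continuation piece
            if (vocab.Contains(candidate)) { match = candidate; break; }
            end--;
        }
        if (match is null) return new List<string> { "[UNK]" };
        pieces.Add(match);
        start = end;
    }
    return pieces;
}

Console.WriteLine(string.Join(", ", WordPiece("peasantively")));
// Prints: peasant, ##ively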

So far, I am under the impression that there is no one-size-fits-all tokeniser that I can apply to every machine learning model out there. Each NLP model I looked at seemed to come with its own tokenisation scheme and vocabulary.

Since I do not feel like writing one myself, I asked Grok to find me a C# tokeniser that can break down my sentences into tokens that MobileBERT can understand. It led me to BertTokenizers on GitHub, which supports tokenisation for different BERT models.

Slow progress; How it ended

Fortunately for you, I will speed up this section, skip the boring attempts at making this work for an hour every few days over the course of several weeks, and jump straight to how it ended.

I ran into an issue with the BertTokenizers library where it tried to load a vocabulary file that was missing, but only when using the Debug configuration. I fixed that by creating my own BERT tokeniser: I copied the library and adjusted it so that I could tokenise my phrases using an embedded file resource. This also allowed me to use the actual vocabulary file linked to the MobileBERT model instead of the one shipped with the library.
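For reference, reading an embedded resource in C# looks roughly like the sketch below. The resource name is hypothetical; it has to match your project’s default namespace and folder layout, and the file must be marked as an EmbeddedResource in the .csproj.

using System.Collections.Generic;
using System.IO;
using System.Reflection;

var assembly = Assembly.GetExecutingAssembly();
using var stream = assembly.GetManifestResourceStream(
        "MobileSentimentDetector.Resources.vocab.txt") // hypothetical name
    ?? throw new FileNotFoundException("Embedded vocabulary not found");

using var reader = new StreamReader(stream);
var vocabulary = new List<string>();
while (reader.ReadLine() is { } line)
    vocabulary.Add(line);
// A token's position in this list is the index the model expects.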

Then I got an error trying to load the MobileBERT model in the application. It seemed like the latest version of ML.NET did not work with the model, which used operators not available before ONNX opset version 14, while the project was resolving version 10 of the ONNX runtime. After a few hours, I figured out that the ONNX runtime library was missing from the project and manually added it.
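If you hit a similar error, the missing piece in my case was the ONNX runtime package, which you can add with a command along these lines:

dotnet add .\MobileSentimentDetector\ package Microsoft.ML.OnnxRuntime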

Now that the code works, I realise that the output I am getting is an array of logits containing two numerical values. After some digging, I realise that the first logit corresponds to the negative sentiment and the second one to the positive sentiment. Logits are raw, unnormalised scores rather than probabilities, but the comparison holds: if the first is higher than the second, my phrase is probably negative. I added some code to translate that.
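A minimal sketch of that translation, using made-up logit values; applying softmax on top is optional, but it turns the logits into probabilities if you want a confidence figure:

using System;

float[] logits = { 1.9f, -0.4f }; // [negative, positive], example values

// Softmax turns raw scores into probabilities that sum to 1.
double expNeg = Math.Exp(logits[0]);
double expPos = Math.Exp(logits[1]);
double pNegative = expNeg / (expNeg + expPos);

string sentiment = logits[0] > logits[1] ? "Negative" : "Positive";
Console.WriteLine($"{sentiment} (p(negative) = {pNegative:F2})");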

Now that my model was running with human-readable outputs, I faced another challenge: every single sentence was inferred as being negative. This is where the mistake I mentioned earlier comes into play. MobileBERT is only a pre-trained model; I still needed to fine-tune it for sentiment detection, which would make the model grow and take an unforeseen (and unwanted) amount of time. At this point, I looked for a different model and found distilbert/distilbert-base-uncased-finetuned-sst-2-english, which is already fine-tuned for this exact task. You can find out more about that model on Hugging Face.

It did the job: my sentences are now accurately classified, using a local machine learning model.

I then decided to try and export the app for the Pinephone, and while the build and deployment succeeded, the app failed to start due to the lack of an ARM64 musl-compatible ONNX runtime library. Since I had already spent a fair amount of time on this, I decided to call it a post and start writing. Compiling the ONNX runtime against the musl version of the C standard library and then running a sentiment detector would be a solid blog post on its own.

What have we learned so far?

Machine learning is an artificial intelligence-focused branch of computer science that aims to make machines able to perform tasks more and more autonomously. Machines can use a machine learning model to decide which task to perform based on a given input.

A machine learning model is a file built as the result of data-driven training where inputs are matched with outputs. During training, parameters are dynamically generated and later used to predict outputs.

In the context of natural language processing, predictions are computed from weight parameters applied to activation values derived from tokenised inputs. The tokenised inputs are created by encoding human language into tokens and converting these into vocabulary indexes that the model can work with.

We can reduce the size of a model through a process called quantisation. This can lead to a loss of precision while still performing predictions accurate enough for our needs.

And now we will wrap this up with a demo clip of my mobile sentiment detector on Windows. Enjoy and see you next time!

Source code available here: https://github.com/CodingNagger/mobile-sentiment-detector.
