almost there

This commit is contained in:
2026-02-09 02:53:12 +01:00
parent 31ccc80aa4
commit 4565eefa3f
10 changed files with 3446 additions and 5 deletions

BIN
report/gen_plot.png Normal file


BIN
report/lucky_entropy.png Normal file


BIN
report/prob_plot.png Normal file

report/ref.bib Normal file

@@ -0,0 +1,10 @@
@online{pavlic2026ga_hyperparameters,
author = {Ted Pavlic},
title = {Guide to Tuning the Many Hyperparameters of a Genetic Algorithm},
year = {2026},
month = feb,
day = {4},
url = {https://www.youtube.com/watch?v=TwZxTuU8LUI},
urldate = {2026-02-08},
note = {YouTube video}
}

report/report.pdf Normal file


report/report.typ Normal file

@@ -0,0 +1,105 @@
#import "@preview/simple-ntnu-report:0.1.2": *
#show: ntnu-report.with(
length: "short",
title: "Assignment 1 -- Report",
subtitle: "IT3708 -- Bio-Inspired Artificial Intelligence",
authors: ((name: "Fredrik Robertsen"),),
// front-image: image("ntnu.png", width: 6cm),
date: datetime(year: 2026, month: 2, day: 9),
language: "english",
bibfile: bibliography("ref.bib"),
bibstyle: "institute-of-electrical-and-electronics-engineers",
column-number: 2,
number-headings: false,
show-toc: true,
)
Source code available (code & report) at: \
https://git.pvv.ntnu.no/frero-uni/IT3708
= Introduction
The problem in question is how to solve the binary knapsack problem quickly by approximation. Recall that this problem and derivations of it, like feature selection, are NP-hard, meaning no polynomial-time algorithm is known for it, and none is expected to exist unless P = NP (source). We can use a genetic algorithm to approximate a solution, tweaking hyperparameters to influence the results (source). This report details the process of solving the knapsack problem and, in particular, feature selection.
= Background & Setup
The binary knapsack problem asks for a subset of items, each with a given weight and value, that maximizes the total value without exceeding the knapsack's weight capacity. If we remove the capacity limit, we have a feature selection problem. This is relevant to regression and machine learning, where you feed in input points along with observed target values and attempt to find a mathematical model that fits them (source).
Genetic algorithms approximate solutions to problems like these, searching for global optima in a search space (source). We can encode the items or features as bits in a bitstring, representing an individual/chromosome of the population. Starting with some number of individuals, we repeat these steps until satisfied:
1. select parents from population;
2. perform crossover with pairs of parents; then
3. mutate offspring.
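The loop above can be sketched as follows. This is an illustrative Python sketch, not the report's Odin code; the operator names and signatures are hypothetical, and here a toy objective is maximized.

```python
def evolve(population, fitness, select, crossover, mutate, generations=100):
    # Generic GA loop with pluggable operators, mirroring the modular
    # design described in the report. Returns the fittest individual
    # found in the final generation (maximization in this toy sketch).
    for _ in range(generations):
        offspring = []
        while len(offspring) < len(population):
            p1 = select(population, fitness)
            p2 = select(population, fitness)
            c1, c2 = crossover(p1, p2)
            offspring += [mutate(c1), mutate(c2)]
        # Full generational replacement: parents are discarded.
        population = offspring[:len(population)]
    return max(population, key=fitness)
```

Because the selection, crossover, mutation, and replacement policies are plain parameters, swapping one out does not touch the rest of the loop.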
I implemented a modular genetic algorithm where I can easily switch out the different operators and hyperparameters, i.e. how we choose parents and survivors or how we perform crossovers and mutations. This allows for easy testing and tuning. Written in over a thousand lines of low-level #text(blue)[#link("https://odin-lang.org/")[Odin]] code, it is fairly performant, especially thanks to memoizing the calculated fitness values. I use #text(blue)[#link("https://uiua.org/")[Uiua]] for plotting, because it was easy.
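The fitness memoization mentioned above can be sketched like this in Python (the report's version is in Odin; this just illustrates the idea of caching by chromosome contents):

```python
from functools import lru_cache

# Chromosomes are hashable tuples of bits, so evaluations can be cached;
# re-evaluating an unchanged chromosome then costs only a dictionary lookup.
@lru_cache(maxsize=None)
def fitness(chromosome: tuple) -> float:
    # Stand-in for the expensive RMSE / knapsack evaluation.
    return float(sum(chromosome))
```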
The code repository has detailed instructions on how to use the code to achieve the following results.
= Results & Reflection
These are the findings.
== Running the algorithm
After successfully stitching together an implementation of a genetic algorithm, this is what the program would spit out:
```
Baseline RMSE: 0.1952
Gen 0: Best=0.1885 Mean=0.1957 Worst=0.2062 Entropy=49.5620
...
Gen 99: Best=0.1920 Mean=0.1974 Worst=0.2088 Entropy=40.3545
```
Note that this particular configuration ended up worse than it started! That's because I ran it with random parent selection, 70% single-point crossover, 1% bit-flip mutation and full generational replacement. The main catch here is that selecting parents at random is not conducive to approaching any optima @pavlic2026ga_hyperparameters. It is, however, a very simple operator to implement.
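The Entropy column could be computed along these lines. This is an assumption about the formula (the report does not spell it out): a per-locus Shannon entropy summed over all bit positions, which is maximal when every locus is split 50/50 across the population.

```python
import math

def population_entropy(population):
    # Sum of per-locus Shannon entropies: for each bit position,
    # H = -p*log2(p) - (1-p)*log2(1-p), where p is the fraction of 1s.
    # A fully converged population scores 0; a maximally diverse one
    # scores one bit per locus.
    n = len(population)
    total = 0.0
    for locus in zip(*population):
        p = sum(locus) / n
        if 0 < p < 1:
            total += -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return total
```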
== Best and Worst RMSE
On to the meat of this.
From our first run above, we can see that the baseline RMSE, obtained by calculating the fitness with all alleles activated, i.e. by selecting all features, is around 0.195. We wish to minimize, and thus seek scores lower than this. Running on the same seed, and hence the same baseline, we obtain the following:
#image("tournament_data.png")
This was run using tournament selection with $k=10$ participants, a population size of $mu = 1000$, crossover rate $P_C = 0.7$ and mutation rate $P_M = 0.01$. Switching to roulette selection yields near-identical results. No matter which of these hyperparameters I tune, the result stays near 0.1811, as shown in the graph.
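For reference, the two parent selection schemes compared here can be sketched as follows (illustrative Python, not the Odin implementation; since RMSE is minimized, "fittest" means lowest fitness, and roulette weights invert the raw scores):

```python
import random

def tournament_selection(population, fitness, k=10):
    # Pick k random individuals and return the fittest.
    # Minimizing RMSE, so the lowest fitness wins the tournament.
    return min(random.sample(population, k), key=fitness)

def roulette_selection(population, fitness):
    # Fitness-proportionate selection; for minimization the raw RMSE
    # must first be inverted into a "bigger is better" weight.
    weights = [1.0 / (1e-9 + fitness(ind)) for ind in population]
    return random.choices(population, weights=weights)[0]
```

Larger $k$ raises selection pressure: with $k$ equal to the population size, tournament selection degenerates into always picking the single best individual.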
== Crowding
Lackluster results bring us to better methods. We have seen that the tournament and roulette parent selection methods exert strong enough selection pressure to force the population to converge within the first thirty or so generations. We can use crowding, which attempts to maintain diversity through niching. Here, this is done with deterministic and probabilistic crowding, both of which are explicit approaches to maintaining diversity.
Sadly, though, I am unable to find any configuration that yields much better than 0.1811, even with these crowding methods. I might have implemented them incorrectly, or the seed I am using (42) may simply be stuck at a particular optimum. Perhaps fitness sharing could help spread out the population more.
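Deterministic crowding, as a representative of the two, can be sketched like this (illustrative Python; names are hypothetical and minimization is assumed, matching the RMSE objective):

```python
def hamming(a, b):
    # Number of differing bits between two chromosomes.
    return sum(x != y for x, y in zip(a, b))

def deterministic_crowding(parent1, parent2, child1, child2, fitness):
    # Each child competes against the most similar parent (by Hamming
    # distance), and the better of each pair survives. Replacing only
    # similar individuals preserves niches and hence diversity.
    if (hamming(parent1, child1) + hamming(parent2, child2)
            <= hamming(parent1, child2) + hamming(parent2, child1)):
        pairs = [(parent1, child1), (parent2, child2)]
    else:
        pairs = [(parent1, child2), (parent2, child1)]
    return [min(p, c, key=fitness) for p, c in pairs]
```

Probabilistic crowding uses the same pairing but lets the loser survive with a probability proportional to its relative fitness, instead of deciding deterministically.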
We can at least measure the entropies between a simple generational GA and a crowding-based one. The following is the entropy of generational replacement, where every parent is discarded in favor of the offspring.
#image("tournament_entropy.png")
Next is the probabilistic crowding entropy:
#image("probabilistic_entropy.png")
Given more time, I could have plotted these in the same plot to more easily assess their differences. We can at least tell that probabilistic crowding maintains the entropy for a little longer, despite the high selection pressure from tournament selection.
== Elitism
The graphs from running with elitism versus crowding are quite similar to the previous ones. I did, however, stumble upon this entropy graph while running with probabilistic crowding and 10 elites under roulette parent selection. I also had to crank the mutation rate up to 0.02 and the crossover rate to 0.8:
#image("lucky_entropy.png")
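Elitism itself is a small survivor-selection tweak; a sketch under the same minimization assumption (illustrative Python, hypothetical names):

```python
def apply_elitism(old_population, offspring, fitness, elites=10):
    # Carry the `elites` best individuals of the old generation over
    # unchanged, filling the rest of the new generation with the best
    # offspring. Guarantees the best solution found is never lost,
    # at the cost of extra selection pressure. Minimization assumed.
    best = sorted(old_population, key=fitness)[:elites]
    rest = sorted(offspring, key=fitness)[:len(offspring) - elites]
    return best + rest
```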
== Comparing with the Knapsack Problem
For the binary knapsack problem, we are maximizing.
#image("gen_plot.png")
This plot gets very close to the optimum (295246) using deterministic crowding. Then there's probabilistic crowding:
#image("prob_plot.png")
There is a bug in my fitness penalty calculation, causing vastly negative fitness values.
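One common way to formulate such a penalty (not necessarily the report's; this is an illustrative Python sketch with hypothetical names) scales it with the capacity overflow, which keeps infeasible solutions comparable instead of driving their fitness vastly negative:

```python
def knapsack_fitness(bits, weights, values, capacity, penalty=10.0):
    # Total value of the selected items, minus a penalty proportional
    # to how far the total weight exceeds capacity. Feasible solutions
    # are not penalized at all.
    value = sum(v for b, v in zip(bits, values) if b)
    weight = sum(w for b, w in zip(bits, weights) if b)
    overflow = max(0, weight - capacity)
    return value - penalty * overflow
```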
= Conclusion
It is not that easy tweaking these hyperparameters. You have to balance the forces of selection pressure and diversity. This vaguely reminds me of Heisenberg's uncertainty principle, that a particle becomes "fuzzy" if you know how it moves, but "fixed" if you don't. I'm no physicist, though.
= Further work
It is likely that some of these implementations are flawed, with regard to correctness, performance, or both. These issues could get ironed out. Additionally, as mentioned above, drawing multiple configurations in the same plot would help compare them. You could also script some automation for tweaking parameters and plotting/logging the data. Lastly, you could modularize the code even further and make the algorithm more generic, allowing for even greater flexibility.

BIN
report/tournament_data.png Normal file



@@ -5,20 +5,20 @@ import "core:container/bit_array"
 Chromosome :: ^bit_array.Bit_Array
 Population :: [POPULATION_SIZE]Chromosome
-PROBLEM_TYPE :: "feature_selection"
+PROBLEM_TYPE :: "knapsack"
 GENERATIONS :: 100
-POPULATION_SIZE :: 100
+POPULATION_SIZE :: 1000
 ELITISM_COUNT :: 0
 SKEW :: 0
-TOURNAMENT_SIZE :: 5
+TOURNAMENT_SIZE :: 10
 CROSSOVER_RATE :: 0.7
 MUTATION_RATE :: 0.01
-PARENT_SELECTION_POLICY :: random_selection
+PARENT_SELECTION_POLICY :: tournament_selection
 CROSSOVER_POLICY :: single_point_crossover
 MUTATION_POLICY :: bit_flip_mutation
-SURVIVOR_SELECTION_POLICY :: generational_replacement
+SURVIVOR_SELECTION_POLICY :: probabilistic_crowding
 RANDOM_SEED :: u64(42)
 OUTPUT_FILE :: "output/data.csv"