The Laws of (Generative) AI

This presentation is from a seminar given in March 2023.

An overview of the limitations discussed:

“Problem 1:

Model architecture has a hand in how the model learns: zero-layer transformers learn different algorithms than one-layer or two-layer attention-only models (example).

Problem 2:

The mathematical assumptions within the model not only differ by architecture but also give rise to very different phenomena. Attention-only models without MLP layers have a linear structure, meaning they can be broken apart and rearranged as chains of matrices and still produce the same outcome. Models with sparse features, however, can end up in superposition: when a model represents more features than it has dimensions, the overlap between feature directions acts as noisy, non-linear interference.

Models in superposition can still perform useful neural computation, but their outputs are much harder to explain. This means that reverse engineering a model's decision making can be as costly as building the model itself.”
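To make the points above concrete, here is a minimal illustration of Problem 1. The notation (W_E for embedding, W_OV for a head's output-value circuit, W_U for unembedding, A^h for head h's attention mixing across positions) follows the common transformer-circuits convention and is my assumption, not taken from the slides. A zero-layer transformer maps each token straight from embedding to logits, so it can learn little more than bigram statistics; a one-layer attention-only model adds context-dependent terms through its heads, a genuinely different algorithm:

\[
\text{zero layer:}\quad \mathrm{logits} = W_U W_E x
\qquad
\text{one layer:}\quad \mathrm{logits} = W_U W_E x + \sum_h A^h \big( W_U W_{OV}^h W_E \big)\, x
\]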
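The linearity claim in Problem 2 can be sketched directly. The following is a minimal NumPy example, with dimensions and matrix names chosen for illustration rather than taken from the presentation, showing that a purely linear path through an attention-only model can be collapsed into a single chain of matrices without changing the result:

```python
import numpy as np

rng = np.random.default_rng(0)

d_vocab, d_model = 50, 16

# Illustrative weights for one path through an attention-only
# model: embed -> output-value circuit -> unembed.
W_E = rng.normal(size=(d_model, d_vocab))   # embedding
W_OV = rng.normal(size=(d_model, d_model))  # one head's OV circuit
W_U = rng.normal(size=(d_vocab, d_model))   # unembedding

x = rng.normal(size=(d_vocab,))  # an input token vector

# Running the path step by step...
step_by_step = W_U @ (W_OV @ (W_E @ x))

# ...matches pre-multiplying the whole chain into one matrix,
# because every step in the path is linear.
collapsed = (W_U @ W_OV @ W_E) @ x

assert np.allclose(step_by_step, collapsed)
```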
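Superposition can be sketched the same way. In this hedged example (the feature counts and the random-projection setup are illustrative assumptions, not the presentation's construction), packing more sparse features than dimensions into a space recovers the active features but leaks interference noise onto the inactive ones:

```python
import numpy as np

rng = np.random.default_rng(1)

n_features, d_model = 20, 5  # more features than dimensions

# Each feature gets a unit direction in the d_model-dimensional
# space; with n_features > d_model these cannot all be orthogonal.
W = rng.normal(size=(d_model, n_features))
W /= np.linalg.norm(W, axis=0)

# A sparse input: only a few features active at once.
x = np.zeros(n_features)
x[[2, 7]] = 1.0

h = W @ x          # compress into d_model dimensions
x_hat = W.T @ h    # linear readout of every feature

# Active features read out strongly, but inactive ones pick up
# non-zero interference because the directions overlap.
print("active readout:", x_hat[[2, 7]])
print("max interference on inactive features:",
      np.abs(np.delete(x_hat, [2, 7])).max())
```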

Find the presentation’s slides here.
