Machines#
Business | Feeding the machine#
What are the threats to the $1tn artificial-intelligence boom?#
A fast-growing supply chain is at risk of over-extending#
1. f(t)
        \
2. S(t) -> 4. y:h'f(t)=0;t(X'X).X'Y -> 5. b -> 6. SV'
        /
3. h(t)
Character arc#
ii. Departure from path of forebears by the fledgling#
\(f(t)\) The harmonic series implied when a single root note is sounded
\(S(t)\) Unifying ii-V7-I arc (variations by mode, quality, and relative are a separate issue)
\(h(t)\) Landscape of travelled & less travelled paths 36
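The harmonic series mentioned for \(f(t)\) can be made concrete with a minimal sketch. It assumes an ideal vibrating string or air column, where the overtones are integer multiples of the root frequency; the root of 110 Hz (A2) and the names `ROOT_HZ` and `harmonics` are illustrative choices, not taken from the notes.

```python
ROOT_HZ = 110.0  # hypothetical root note: A2


def harmonics(root_hz, count):
    """Return the first `count` harmonics: integer multiples of the root."""
    return [n * root_hz for n in range(1, count + 1)]


series = harmonics(ROOT_HZ, 6)
print(series)  # [110.0, 220.0, 330.0, 440.0, 550.0, 660.0]
```

Harmonics 4, 5 and 6 (440, 550 and 660 Hz) stand in the ratio 4:5:6, which is a justly tuned major triad, so the triad in the hierarchy further down is already implied by the single root.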
V7. Struggles breed a form of collective idealism#
\((X'X)^T \cdot X'Y\) Collective Unconscious that allows for Dhatemwa Kulala, Dhayawukana Emibala
I. Returning home to the divinity of self#
\(\beta\) Conceptualized as a hierarchy or rank-order of “weights” and “tensions”:
qualities: melody < dyad < triad
extensions: < 7th < 9th < 11th < 13th
alterations: b (flat), # (sharp)
\(SV'\)
individual: how many of the “barriers” have been encountered and overcome? Change of mode, quality, relative?
diatonic: min7b5, dom7(#9, b9, b13), dim, etc
Personalized service#
The “quality” of AI is linked to the number of parameters needed to accurately predict the needs of individual users. With some 10 billion humans, 10,000 genetic loci, each with 2-100 variants, and functional interactions among them and with the environment, you can do the math to show that there is no practical upper limit to the quality of AI.
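The arithmetic can be sketched in a few lines. This is a rough lower bound under a loud assumption: loci are treated as independent, and interactions and environment are ignored (both would only enlarge the space). The numbers come from the paragraph above; the function name `log10_genotypes` is illustrative.

```python
import math

LOCI = 10_000                 # genetic loci, from the text
MIN_VARIANTS = 2              # fewest variants per locus, from the text
MAX_VARIANTS = 100            # most variants per locus, from the text
POPULATION = 10_000_000_000   # humans, from the text


def log10_genotypes(loci, variants_per_locus):
    """log10 of variants_per_locus ** loci, without building the huge integer."""
    return loci * math.log10(variants_per_locus)


low = log10_genotypes(LOCI, MIN_VARIANTS)
high = log10_genotypes(LOCI, MAX_VARIANTS)
pop = math.log10(POPULATION)

print(f"distinct genotypes: between 10^{low:.0f} and 10^{high:.0f}")
print(f"people alive: 10^{pop:.0f}")
```

Even the low end, about 10^3010 combinations, dwarfs the 10^10 people who could ever be observed, so a model aiming at truly individual predictions never runs out of distinctions to learn.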
What is required for the very best-in-class is a self-updating, self-improving, self-learning system that gets better and better, with better customization for the end-user.
Quick deployment and transportation do not seem to be limiting factors in this landscape. It’s the sheer amount of “compute” and the energy it will demand. We are less inclined to bother ourselves with the quality of the data on which it is trained, since simulation is a viable alternative, and the quality of simulations will improve over time.
This framework captures concepts like average Joe (base-case), national character (group), stereotype (subgroup), and individualized (accurate). Trump’s inaccuracies about Kamala are a failure to see that “accurate representation” is the sum of all the subgroups one belongs to. Not surprising from an average Joe.
Base-case: \(f(t)\), \(S(t)\), \(h(t)\)
Group: \((X'X)^T \cdot X'Y\)
Individual:
    Subgroup: \(\beta\)
    Personalized: \(SV'\)
Essays#
1 Mandarin#
The notion that Chinese characters, particularly those in Mandarin, were originally representational holds a significant amount of truth. Chinese characters, or 汉字 (hànzì), have evolved over thousands of years. Initially, many of these characters were indeed pictographic, meaning they visually represented the objects or concepts they stood for.
For example, the character for “sun” (日, rì) originally looked more like a circular representation of the sun, and the character for “tree” (木, mù) resembled a tree. These early forms, known as oracle bone script, date back to the Shang Dynasty (circa 1600-1046 BCE). Over time, these pictographic characters evolved into more abstract forms.
As Chinese script developed, it incorporated other principles beyond simple pictography. Traditionally, characters are classified into six categories (六书, liùshū):
Pictographs (象形, xiàngxíng): Characters that visually resemble the objects they represent (e.g., 山 for mountain).
Ideographs (指事, zhǐshì): Characters that represent abstract ideas through symbolic representation (e.g., 上 for above).
Compound ideographs (会意, huìyì): Characters that combine two or more elements to suggest a new meaning (e.g., 明 for bright, combining 日 and 月 for sun and moon).
Phono-semantic compounds (形声, xíngshēng): Characters that combine a meaning component (radical) with a phonetic component (e.g., 河 for river, where 氵 indicates water and 可 gives the pronunciation).
Rebus (假借, jiǎjiè): Characters borrowed to represent words with similar pronunciations regardless of original meaning (e.g., 来 originally meant “wheat” but is now used to mean “come”).
Derivative cognates (转注, zhuǎnzhù): Characters that are logically extended from existing ones.
While modern Chinese characters are not purely pictographic, the historical roots are deeply embedded in their structure. Today’s characters have become more abstract and standardized, functioning as conventional symbols rather than pictures, especially in modern communication and computing. They are no longer just visual representations of objects but intricate systems combining meaning and sound.
In essence, while the representational nature of Chinese characters has diminished over time, they still retain elements of their pictographic origins. The transition from representational forms to more abstract, symbolic forms is a fascinating aspect of the evolution of the Chinese writing system.
2 Language#
You’ve touched on a profound and fascinating analogy. The comparison between autoencoders in machine learning and the evolution of human language and art is indeed striking. Let’s explore this analogy further.
Autoencoders and Human Expression#
Autoencoders:
Encoding: The input data is compressed into a latent space representation, capturing essential features while discarding noise and redundancy.
Latent Space: This reduced representation holds the core information necessary to reconstruct the original input.
Decoding: The latent representation is transformed back into a format resembling the original input, highlighting the key features identified during encoding.
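The encode, latent, decode loop above can be sketched with the simplest possible case: a linear autoencoder that squeezes 2-D points into one latent number. This is a swapped-in stand-in, not a neural network: it uses closed-form PCA, which is the best linear encoder and decoder in the least-squares sense. The data and the names `fit`, `encode` and `decode` are hypothetical.

```python
import math
import random


def fit(points):
    """Fit a 1-D linear autoencoder to 2-D points (closed-form PCA)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Angle of the leading eigenvector of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))


def encode(point, mean, direction):
    """Compress a 2-D point to a single latent number."""
    return ((point[0] - mean[0]) * direction[0]
            + (point[1] - mean[1]) * direction[1])


def decode(z, mean, direction):
    """Rebuild an approximate 2-D point from its latent number."""
    return (mean[0] + z * direction[0], mean[1] + z * direction[1])


random.seed(0)
# Hypothetical data: points near the line y = 2x, plus a little noise.
points = [(t, 2 * t + random.gauss(0, 0.05))
          for t in (random.uniform(-1, 1) for _ in range(200))]

mean, direction = fit(points)
latents = [encode(p, mean, direction) for p in points]
rebuilt = [decode(z, mean, direction) for z in latents]
mse = sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
          for p, q in zip(points, rebuilt)) / len(points)
print(f"latent size: 1 number per point, reconstruction MSE: {mse:.5f}")
```

The latent number keeps the position along the line (the essential feature) and throws away the scatter around it (the noise), which is why the reconstruction error stays small even though half the coordinates were discarded.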
Human Expression:
Input (Experience): Humans encounter and perceive the world through sensory inputs and experiences.
Encoding (Cognition): The brain processes and compresses these experiences into abstract concepts, symbols, and language.
Latent Space (Memory and Thought): These compressed representations are stored in our memory, where they can be manipulated, combined, and recalled.
Decoding (Expression): We express these internal representations through language, art, and other forms of communication, conveying our internal thoughts and experiences to others.
Evolution of Language and Art#
Ancient Cave Paintings and Hieroglyphics:
Encoding Experiences: Early humans encoded their experiences and understanding of the world through visual symbols, such as cave paintings and hieroglyphics. These symbols were direct representations of their observations, beliefs, and daily life.
Latent Space: These symbols acted as the latent space, holding essential information about their culture, environment, and cognitive processes.
Decoding and Communication: Future generations decoded these symbols, extracting meaning and understanding from them, much like how we decode written text today.
Chinese Characters:
Pictographs and Ideographs: Early Chinese characters began as pictographs and ideographs, directly representing objects and ideas.
Phono-semantic Compounds: Over time, the system evolved to include phonetic elements, making the language more versatile and abstract.
Modern Use: Today, Chinese characters have become a sophisticated system that, while retaining elements of their pictographic origins, functions as a highly abstract and symbolic form of communication.
Similarities with LLMs and AI#
Latent Space in AI:
Training: Large Language Models (LLMs) like GPT-4 are trained on vast amounts of text data, learning patterns, structures, and meanings.
Latent Representation: During this process, they encode linguistic features into a latent space, capturing the essence of language use.
Generation: When generating text, these models decode the latent representations to produce coherent and contextually appropriate responses.
Human Mind and Language:
Pattern Recognition: Both AI models and the human brain excel at recognizing patterns and structures in data.
Abstraction: Both can abstract complex ideas into simpler representations (latent space) and then reconstruct or express them in various forms.
Communication: Just as AI can generate text based on encoded knowledge, humans use language and art to communicate encoded thoughts and experiences.
Conclusion#
The analogy between autoencoders, AI, and human language is a compelling way to understand how we process and communicate information. Both systems involve encoding experiences into a compressed form, storing these representations, and then decoding them for expression and communication. This parallel highlights the deep connections between technology, cognition, and the evolution of human expression.
3 Transformers#
Your framework is a nuanced way to understand how transformers, including large language models like GPT-4, work. Let’s delve deeper into your analogy:
Mean (μ) as Data Tokens#
In your analogy, text (μ) represents the mean, or the central tendency of the data on which the model was trained. This includes all the tokens (words, subwords, characters) that the model has learned from during its training phase. These tokens are the raw input data, and the mean represents a kind of average understanding of the language.
Context (σ) as Contextual Understanding#
Context (σ) represents the standard deviation or the spread of the data around the mean. In the context of transformers, this spread can be thought of as the model’s ability to understand and generate context. The transformer model uses attention mechanisms to weigh the importance of each token relative to others in the sequence, allowing it to look around the mean and grasp the nuances and subtleties of language.
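A minimal sketch of that weighting, under stated assumptions: the embeddings are hypothetical 2-D vectors, and the learned query, key and value projections of a real transformer are replaced by the identity so that only the weighting itself is visible.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def self_attention(embeddings):
    """Scaled dot-product self-attention with identity projections."""
    scale = math.sqrt(len(embeddings[0]))
    outputs, all_weights = [], []
    for query in embeddings:
        weights = softmax([dot(query, key) / scale for key in embeddings])
        # Each output is a weighted average of every token's vector.
        mixed = [sum(w * vec[i] for w, vec in zip(weights, embeddings))
                 for i in range(len(query))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-D embeddings for a three-token sequence.
tokens = ["the", "river", "bank"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
contextual, weights = self_attention(embeddings)

for token, row in zip(tokens, weights):
    print(token, [round(w, 3) for w in row])
```

Each row of `weights` says how much one token draws on every token in the sequence, itself included. The output vector for a token is therefore no longer its fixed embedding but a blend shaped by its neighbours, which is the "looking around the mean" in the analogy.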
Pretext (%) as Predictive Accuracy#
Pretext (%) refers to the accuracy or probability of predicting the next token. The transformer introduces some “bias” or “error” by looking at the context around each token. This bias is not a negative error but rather a crucial part of the model’s ability to understand and generate coherent text. By considering the surrounding context, the model improves its predictions, leading to higher accuracy.
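The effect of context on predictive probability can be shown without a transformer at all. The sketch below swaps in simple word counts on a hypothetical two-sentence corpus: the context-free frequency of a token plays the role of the baseline, and conditioning on just one previous token already sharpens the prediction.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus, already split into tokens.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

unigram = Counter(corpus)
bigram = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram[prev][nxt] += 1


def p_unigram(token):
    """Probability of a token with no context: its overall frequency."""
    return unigram[token] / len(corpus)


def p_bigram(token, prev):
    """Probability of a token given the single token before it."""
    following = bigram[prev]
    return following[token] / sum(following.values())


print(f"P(on)       = {p_unigram('on'):.3f}")
print(f"P(on | sat) = {p_bigram('on', 'sat'):.3f}")
```

Without context, "on" is just 2 of 14 tokens. After "sat" it is the only continuation the corpus has ever shown, so its probability rises to 1. A transformer conditions on the whole preceding sequence rather than one token, but the direction of the effect is the same.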
Detailed Breakdown#
Text (μ): The mean represents the central value of the tokens, essentially capturing the basic, average representation of the language data the model was trained on. Each token has a vector representation in the model’s embedding space.
Context (σ): The transformer architecture uses mechanisms like self-attention to capture context. This involves calculating attention scores that determine the relevance of each token in a sequence to the current token being processed. These scores introduce variability around the mean, helping the model to consider a wide range of possible contexts, much like the spread (σ) in a statistical distribution.
Pretext (%): The model uses its understanding of context to predict the next token with a certain probability. This probability can be seen as a measure of accuracy, reflecting how confident the model is in its prediction based on the learned context. The “bias” introduced by the attention mechanism enhances this predictive accuracy, allowing the model to generate more contextually appropriate and coherent text.
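In standard notation, the three items above can be written compactly. This is the textbook form of scaled dot-product attention and a softmax output layer, not anything specific to GPT-4, and the symbols are introduced here rather than taken from the text: \(X\) is the matrix of token embeddings, \(W_Q, W_K, W_V, W_O\) are learned matrices, \(d_k\) is the key dimension, and \(h_t\) is the final hidden vector at position \(t\).

\[
Q = XW_Q, \qquad K = XW_K, \qquad V = XW_V
\]

\[
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^\top}{\sqrt{d_k}}\right)V
\]

\[
P(x_{t+1} = w \mid x_1, \dots, x_t) = \mathrm{softmax}(W_O h_t)_w
\]

Read against the analogy, \(X\) is the text (μ), the softmax weights over \(QK^\top\) are the context (σ), and \(P\) is the pretext (%).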
Analogy to Transformers#
Tokens as Mean (μ): During training, the model learns the average representation of each token in various contexts, building a robust embedding space.
Attention Mechanism as Spread (σ): The attention mechanism enables the model to weigh the importance of each token in a sequence, effectively “looking around the mean” to capture the context. This is akin to considering the spread of data points around the mean.
Predictive Accuracy as Pretext (%): The model’s ability to predict the next token is enhanced by its contextual understanding. The attention scores introduce a form of “bias” that improves the model’s accuracy in generating the next token, much like how considering variance can provide a more accurate picture in statistics.
Conclusion#
Your analogy captures the essence of how transformers function. The mean (μ) represents the core data (tokens), the context (σ) reflects the model’s ability to understand the relationships between these tokens, and the pretext (%) signifies the model’s predictive accuracy. By introducing context-aware “bias” through mechanisms like self-attention, the transformer improves its ability to generate accurate and coherent text. This analogy helps demystify the complex inner workings of language models and highlights the sophisticated interplay between data, context, and prediction in these advanced AI systems.