AI Introduction
Artificial Intelligence (AI) is rapidly transforming societies, economies, and governance structures worldwide. As AI technologies become more deeply embedded in decision-making, service delivery, and knowledge production, understanding the terminology that underpins these systems is essential for educators, policymakers, technologists, and community leaders. This need is particularly pressing in contexts where data sovereignty, Indigenous rights, and tribal governance intersect with digital innovation.
The following glossary provides a comprehensive, alphabetized reference to foundational and specialized terms in AI, with an emphasis on concepts relevant to data sovereignty and tribal governance. Each entry is designed to be concise yet informative, suitable for educational materials, policy briefings, and cross-sectoral dialogue.
Glossary
- Accuracy measures the proportion of correct predictions made by a model out of all predictions. It is a common metric for classification tasks, but it can be misleading on imbalanced datasets, where always predicting the majority class already scores highly.
- An algorithm is a step-by-step procedure or set of rules for solving a problem or performing a computation. In AI, algorithms are used to train models, process data, and make predictions. Examples include decision trees, support vector machines, and neural networks.
- Artificial intelligence (AI) refers to computer systems or models capable of performing tasks that typically require human intelligence, such as reasoning, learning, perception, and decision-making. AI encompasses a broad range of technologies, from rule-based systems to advanced machine learning and deep learning models. AI systems can be designed to mimic human behaviors, process natural language, recognize patterns, and solve complex problems.
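- AUC (area under the curve, typically the receiver operating characteristic curve) measures a model's ability to distinguish between classes. A higher AUC indicates better performance in binary classification tasks: 1.0 means the classes are separated perfectly, and 0.5 is no better than chance.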
- Automation bias is the tendency of humans to favor recommendations made by automated systems over non-automated information, even when the system is incorrect.
- A batch is a subset of the training data processed together in one iteration during model training. Batch size affects training speed and model convergence.
- Bias in AI refers to systematic errors or unfairness in model predictions, often arising from imbalanced or unrepresentative training data, flawed algorithms, or human prejudices. Bias can lead to discriminatory outcomes and undermine trust in AI systems.
- A black box is an AI system whose internal workings are not easily understood by humans, often due to complexity or lack of transparency. Many deep learning models are considered black boxes.
- Cloud deployment refers to running AI models on remote servers, leveraging scalable computational resources. It is suitable for complex models and centralized management but may introduce latency and privacy concerns.
-
involve actively engaging community members in the design, implementation, and evaluation of research or program initiatives. These methods prioritize local knowledge, cultural relevance, and empowerment.
- Community-based participatory research (CBPR) is an approach that involves community members as equal partners in all stages of the research process, from design to dissemination. CBPR prioritizes mutual benefit, respect, and capacity building.
- Computer vision is an interdisciplinary field that enables computers to interpret and understand visual information from the world, such as images and videos. Computer vision tasks include image classification, object detection, image segmentation, and facial recognition. These technologies are integral to applications like autonomous vehicles, medical imaging, and surveillance systems.
-
are legal frameworks that protect personal information, including health and substance abuse data. These laws apply to all parties involved in data sharing agreements and research.
-
refers to the design and implementation of technology solutions that respect and incorporate the cultural values, traditions, and needs of the communities they serve. This approach ensures that digital tools and platforms are relevant, accessible, and empowering for Indigenous and other marginalized groups.
-
is research that respects Indigenous culture and values, is equitable, not researcher-centered, and is relevant to Indigenous ways of knowing. It emphasizes the importance of relationships, respect, and community benefit.
-
is the practice of storing data within the geographic boundaries of the community or nation to which it pertains. This ensures that data remains under the jurisdiction and control of the relevant authority.
- Deep learning is a specialized area within machine learning that uses multi-layered artificial neural networks to model complex patterns and relationships in data. Deep learning algorithms are particularly effective for tasks involving large, unstructured datasets such as images, audio, and text. These models can automatically extract features from raw data, reducing the need for manual feature engineering.
- Differential privacy is a technique for protecting sensitive data by adding calibrated statistical noise to a dataset or to results computed from it, making it difficult to determine whether any individual's record is present. It is used to enhance privacy in AI training and data sharing.
- Disparate impact occurs when AI decisions disproportionately affect different population subgroups, even if the system appears neutral.
- Disparate treatment involves explicitly factoring sensitive attributes into decision-making, leading to different treatment of subgroups.
- Distillation is a technique for compressing a large, complex model (teacher) into a smaller, more efficient model (student) that approximates the original model's performance.
- Edge deployment involves running AI models locally on devices (such as smartphones or IoT devices) rather than in the cloud. This approach reduces latency, enhances privacy, and enables offline functionality, but may be limited by device resources.
- Fairness in AI is the principle of ensuring that AI systems do not discriminate against individuals or groups based on protected characteristics such as race, gender, or ethnicity. Fairness metrics include demographic parity, equal opportunity, and individual fairness.
- Feature extraction refers to the process of transforming raw data (such as images) into a set of measurable characteristics (features) that can be used for further analysis or modeling. In computer vision, feature extraction is often automated by deep learning models.
- Fine-tuning is the process of further training a pre-trained model on a specific dataset or task to adapt it to particular requirements or domains.
- Generalization is the ability of a trained model to perform well on unseen data, not just the training set. Strong generalization indicates a robust model.
- Generative AI refers to AI systems capable of creating new content, such as text, images, audio, or video, based on patterns learned from training data.
- Hallucination in AI refers to the generation of outputs that are factually incorrect or not grounded in the training data, often presented as plausible information by the model.
- Historical bias refers to biases present in the world that are reflected in datasets, often perpetuating stereotypes or inequalities.
- A hyperparameter is a configuration variable set before training a model, such as learning rate, batch size, or number of layers. Hyperparameter tuning is essential for optimizing model performance.
- Image classification is the process of assigning a label to an entire image based on its content, for example determining whether an image contains a cat or a dog. This is a foundational task in computer vision.
- Image segmentation is the process of partitioning an image into multiple segments or regions, often at the pixel level, to identify objects or boundaries. Semantic segmentation assigns a class label to each pixel, enabling detailed scene understanding.
- Indigenous data governance refers to the mechanisms, policies, and practices that enable Indigenous communities to make decisions about how data is collected, interpreted, accessed, stored, and used. It operationalizes data sovereignty and supports self-determination.
- Indigenous data sovereignty asserts the rights of Indigenous Peoples and nations to govern the collection, ownership, and application of data about their peoples, lands, and cultures. This is grounded in inherent sovereignty and is recognized in international frameworks such as the United Nations Declaration on the Rights of Indigenous Peoples (UNDRIP).
- Intergovernmental agreements (IGAs) are formal arrangements between tribal governments and other governmental entities (such as states or counties) to coordinate policies, share resources, or collaborate on shared concerns. IGAs are essential for managing complex jurisdictional relationships and ensuring effective service delivery.
- Interpretability is the degree to which a human can understand the cause of a decision made by an AI model. Highly interpretable models, such as linear regression or decision trees, allow users to trace how inputs affect outputs.
- Jurisdiction refers to the legal authority of a government to make and enforce laws within a defined area. In tribal contexts, jurisdiction may be shared or contested among tribal, federal, and state governments.
- A large language model (LLM) is a deep learning model trained on vast amounts of text data to understand and generate human-like language for tasks like question answering, summarization, translation, and conversational agents.
- Machine learning (ML) is a subset of AI focused on developing algorithms that enable computers to learn from data and improve their performance over time without being explicitly programmed. ML models identify patterns in data and use these patterns to make predictions or decisions. ML is foundational to many modern AI applications, including recommendation systems, image recognition, and predictive analytics.
- Natural language processing (NLP) is a field of AI focused on enabling computers to understand, interpret, and generate human language. NLP powers applications such as chatbots, language translation, sentiment analysis, and speech recognition.
- Network sovereignty is the exercise of authority over digital infrastructure, including broadband networks and wireless spectrum, by Tribal Nations. It enables tribes to design, build, and govern their own digital futures.
- A neural network is a computational model inspired by the structure of the human brain, consisting of interconnected layers of nodes (neurons). Neural networks are capable of learning complex, non-linear relationships between inputs and outputs. Deep neural networks, with multiple hidden layers, are the backbone of many state-of-the-art AI systems, including those used in speech and image recognition.
- Object detection involves identifying and locating multiple objects within an image or video. Unlike image classification, object detection not only classifies objects but also provides their positions, typically using bounding boxes. Object detection is crucial for real-time applications such as autonomous driving and industrial automation.
- Overfitting occurs when a model learns the training data too closely, including noise and outliers, resulting in poor generalization to new data. Techniques to prevent overfitting include regularization, dropout, and cross-validation.
- Precision is the proportion of positive predictions that are actually correct. It is particularly important when the cost of false positives is high.
- Prompt engineering is the practice of designing and refining prompts to achieve desired outputs from generative AI systems.
- Recall (or sensitivity) is the proportion of actual positives that are correctly identified by the model. It is crucial when missing positive cases is costly, such as in disease detection.
- Reinforcement learning (RL) is a type of machine learning where an agent learns to make decisions by interacting with an environment and receiving feedback in the form of rewards or penalties. The agent's objective is to maximize cumulative rewards over time. RL is widely used in robotics, game playing, and autonomous systems.
- Self-determination is the right of Indigenous peoples to freely determine their political status and pursue their economic, social, and cultural development. It is a foundational principle in international law and Indigenous rights frameworks.
- Spectrum sovereignty refers to the right of Tribal Nations to control and manage the electromagnetic radio frequencies used for wireless communication within their territories.
- A strengths-based approach focuses on the strengths, knowledge, and capacities of Indigenous communities, rather than deficits or problems. It promotes empowerment, self-determination, and positive outcomes.
- Supervised learning is a machine learning approach where models are trained on labeled datasets, meaning each input is paired with a known output. The model learns to map inputs to outputs and can then predict outcomes for new, unseen data. Supervised learning is commonly used for classification (e.g., spam detection) and regression (e.g., price prediction) tasks.
- The Five Rs (Relationship, Respect, Reciprocity, Relevance, and Responsibility) are values embedded in Indigenous research methodologies. They guide ethical engagement and knowledge sharing, and they help ensure that research benefits communities.
- Training data consists of examples used to teach an AI model how to perform a task. The quality, diversity, and representativeness of training data are critical for model performance and generalization.
- Transparency is the quality of being open and clear about how AI systems operate, including their data sources, algorithms, and decision-making processes. Transparency is vital for accountability and trust.
- Tribal digital infrastructure includes the technological systems and networks that support tribal governance, data sovereignty, and digital services. This infrastructure is foundational for enabling digital equity, cultural preservation, and economic development within tribal communities.
- Tribal digital sovereignty refers to the authority of Tribal Nations to control and manage digital infrastructure, data, and technologies in ways that support self-determination and protect community interests. It encompasses governance over broadband networks, data storage, digital codes, and the use of emerging technologies such as AI.
- Tribal sovereignty is the inherent authority of Indigenous tribes to govern themselves within the borders of a nation-state, such as the United States. It is rooted in pre-colonial nationhood and affirmed by treaties, constitutional provisions, and court decisions. Tribal sovereignty enables tribes to create laws, manage resources, operate courts, and negotiate with other governments.
- Unsupervised learning involves training models on data without explicit labels. The goal is to discover hidden patterns, groupings, or structures within the data. Common applications include clustering (e.g., customer segmentation) and dimensionality reduction (e.g., principal component analysis).
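
As a worked illustration of three evaluation terms defined in this glossary (the proportion of correct predictions, the proportion of positive predictions that are correct, and the proportion of actual positives that are found), the following is a minimal Python sketch. The function name `classification_metrics` and the toy labels are hypothetical and exist only for this example. The sketch also assumes binary labels and reports 0.0 when a ratio's denominator is zero, which is one common convention rather than the only one.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, and recall for binary labels.

    Illustrative helper only; the name and the zero-denominator
    convention (report 0.0) are assumptions made for this sketch.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one prediction is required")

    pairs = list(zip(y_true, y_pred))
    # True positives: actual positive, predicted positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: actual negative, predicted positive.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: actual positive, predicted negative.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical imbalanced dataset: 95 negative cases and 5 positive cases.
y_true = [0] * 95 + [1] * 5
# A "model" that always predicts the majority (negative) class.
always_negative = [0] * 100

metrics = classification_metrics(y_true, always_negative)
print(metrics)  # {'accuracy': 0.95, 'precision': 0.0, 'recall': 0.0}
```

The always-negative predictor scores 95% on the first measure while finding none of the positive cases, which is why that measure alone can be misleading on imbalanced datasets and why the other two are usually reported alongside it.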