Machine Learning and Artificial Intelligence

Machine Learning (ML) is a subset of Artificial Intelligence (AI) that gives systems the ability to learn from experience without being explicitly programmed. It rests on building algorithms that take input data, use statistical analysis to predict an output, and update those predictions as new data becomes available.

For example, an algorithm trained on a dataset of clinical histories and their associated diagnoses could predict the diagnosis for a clinical history it has never seen before. Machine learning has numerous applications, including recommendation systems, image recognition, and natural language processing.

AI is the broader concept: it covers any technique that lets computers perform tasks normally associated with human intelligence. Machine learning is one approach to achieving AI; others include rule-based systems. The goal of AI is to make machines intelligent, where an intelligent machine is a flexible rational agent that perceives its environment and takes actions to maximize its chance of success.
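To make the contrast with rule-based systems concrete, here is a minimal sketch of a hand-written classifier for R's built-in iris dataset (150 flowers from three species). The function name classify_iris and both thresholds are assumptions chosen for illustration; nothing here is learned from the data, which is the point of the comparison.

```r
# Hand-written rules for the iris data. The thresholds are illustrative
# assumptions picked by eye, not values learned by an algorithm.
classify_iris <- function(petal_length, petal_width) {
  ifelse(petal_length < 2.5, "setosa",
         ifelse(petal_width < 1.75, "versicolor", "virginica"))
}

data(iris)
rule_pred <- classify_iris(iris$Petal.Length, iris$Petal.Width)

# Share of flowers that the fixed rules label correctly
rule_accuracy <- mean(rule_pred == as.character(iris$Species))
print(rule_accuracy)
```

A machine learning approach would instead derive its decision boundaries from labeled examples, so it could adapt if the data changed, whereas these rules would have to be rewritten by hand.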

Data Visualization
Data visualization is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to see and understand trends, outliers, and patterns in data.

For example, a histogram could be used to understand the distribution of a dataset, while a scatter plot could be used to understand the relationship between two variables.

Example Code

Machine Learning Example
We'll use the iris dataset, which is included in R's datasets package. It contains four measurements (sepal length, sepal width, petal length, and petal width, in centimeters) for 150 iris flowers, 50 from each of three species.

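# load the class package, which provides knn()
library(class)

# load dataset
data(iris)

# create training and test sets: each row is assigned to group 1 (train)
# or group 2 (test) with probability 0.8 / 0.2, so the split is only
# approximately 80/20
set.seed(1234)
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
iris.train <- iris[ind == 1, 1:4]        # the four numeric measurements
iris.test <- iris[ind == 2, 1:4]
iris.trainLabels <- iris[ind == 1, 5]    # the species column
iris.testLabels <- iris[ind == 2, 5]

# KNN model: classify each test flower by its 3 nearest training neighbors
iris_pred <- knn(train = iris.train, test = iris.test, cl = iris.trainLabels, k = 3)

# check accuracy: share of test flowers classified correctly
mean(iris_pred == iris.testLabels)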
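
A single accuracy figure hides which species get confused with each other. As a hedged sketch, the same split and model can be summarized with a confusion matrix built with base R's table(). The variable names train_x, test_x, train_y, test_y, and conf_matrix are chosen here for illustration, and the block repeats the setup so that it runs on its own.

```r
library(class)  # provides knn()

data(iris)
set.seed(1234)

# Same approximate 80/20 split as in the example above
ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.8, 0.2))
train_x <- iris[ind == 1, 1:4]
test_x  <- iris[ind == 2, 1:4]
train_y <- iris[ind == 1, 5]
test_y  <- iris[ind == 2, 5]

pred <- knn(train = train_x, test = test_x, cl = train_y, k = 3)

# Rows are predicted species, columns are true species
conf_matrix <- table(predicted = pred, actual = test_y)
print(conf_matrix)

# Accuracy is the share of counts on the diagonal
accuracy <- sum(diag(conf_matrix)) / sum(conf_matrix)
print(accuracy)
```

Off-diagonal cells count the misclassified flowers, so they show directly which pairs of species the model mixes up.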


Data Visualization Example
Using ggplot2, we'll plot sepal length against sepal width for the iris dataset, with color and point shape indicating species, to see how the two measurements relate within each species.

# load library
library(ggplot2)

# create a scatterplot
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(aes(color = Species, shape = Species)) +
  theme_minimal() +
  labs(title = "Iris dataset: sepal length vs. sepal width", x = "Sepal length (cm)", y = "Sepal width (cm)")
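
The scatter plot shows the relationship between two variables. To look at the distribution of a single variable, as described in the Data Visualization section, a histogram is the usual choice. The sketch below uses base R's hist() rather than ggplot2 so that it needs no extra package. The choice of petal length and of 20 breaks is an assumption for illustration.

```r
data(iris)

# Histogram of petal length. breaks = 20 is only a suggestion to hist(),
# which may settle on a nearby number of bins.
petal_hist <- hist(iris$Petal.Length,
                   breaks = 20,
                   main = "Distribution of petal length",
                   xlab = "Petal length (cm)",
                   col = "grey80")
```

The plot should show two separated groups of values, which suggests that petal length alone distinguishes at least one species from the others.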


As with Python, these examples in R illustrate the basic steps involved in machine learning and data visualization: splitting the data, fitting a model, checking its accuracy, and plotting the results. Packages such as class and ggplot2, together with R's built-in data-handling capabilities, make it a strong choice for data science tasks.