Chapter 26 Word clouds in R

## [1] "2024-11-07"

26.1 Example #1

To generate word clouds in R, you need the packages wordcloud, RColorBrewer, and wordcloud2. The text-cleaning steps in this chapter also use tm and SnowballC.

This example is a copy of the following web page

if (!require("pacman")) install.packages("pacman")
pacman::p_load(tm, SnowballC, wordcloud, RColorBrewer, wordcloud2)

library(wordcloud) # package for drawing word clouds
library(wordcloud2) # simpler package for drawing word clouds
library(RColorBrewer) # package of color palettes
library(tm)  # text-mining package
library(SnowballC) # package for stemming words, including in languages other than English

26.2 Using the demoFreq dataset from the wordcloud2 package

head(demoFreq, n=10)
##          word freq
## oil       oil   85
## said     said   73
## prices prices   48
## opec     opec   42
## mln       mln   31
## the       the   26
## last     last   24
## bpd       bpd   23
## dlrs     dlrs   23
## crude   crude   21
wordcloud2(data = demoFreq)
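
wordcloud2() expects a data frame with the words in one column and their frequencies in another, which is the layout of demoFreq shown above. As a minimal sketch, a table with that layout can be built from your own words in base R. The `words` vector and the name `my_freq` are made up for illustration:

```r
# Hypothetical input: a vector of words (not taken from the chapter's data)
words <- c("oil", "prices", "oil", "opec", "oil", "prices")

# Count each word and sort from most to least frequent
counts <- sort(table(words), decreasing = TRUE)

# Same layout as demoFreq: a `word` column and a `freq` column
my_freq <- data.frame(word = names(counts),
                      freq = as.integer(counts),
                      stringsAsFactors = FALSE)
print(my_freq)

# wordcloud2::wordcloud2(data = my_freq)  # uncomment if wordcloud2 is installed
```

Any table built this way can be passed to wordcloud2() in place of demoFreq.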

26.3 Step 1

Import the data from the web

filePath <- ""
text <- readLines(filePath)
##  [1] ""                                                                                                                                                                                                                                                                                                                                                                                                              
##  [2] "And so even though we face the difficulties of today and tomorrow, I still have a dream. It is a dream deeply rooted in the American dream."                                                                                                                                                                                                                                                                   
##  [3] " "                                                                                                                                                                                                                                                                                                                                                                                                             
##  [4] "I have a dream that one day this nation will rise up and live out the true meaning of its creed:"                                                                                                                                                                                                                                                                                                              
##  [5] " "                                                                                                                                                                                                                                                                                                                                                                                                             
##  [6] "We hold these truths to be self-evident, that all men are created equal."                                                                                                                                                                                                                                                                                                                                      
##  [7] " "                                                                                                                                                                                                                                                                                                                                                                                                             
##  [8] "I have a dream that one day on the red hills of Georgia, the sons of former slaves and the sons of former slave owners will be able to sit down together at the table of brotherhood."                                                                                                                                                                                                                         
##  [9] " "                                                                                                                                                                                                                                                                                                                                                                                                             
## [10] "I have a dream that one day even the state of Mississippi, a state sweltering with the heat of injustice, sweltering with the heat of oppression, will be transformed into an oasis of freedom and justice."                                                                                                                                                                                                   
## [11] " "                                                                                                                                                                                                                                                                                                                                                                                                             
## [12] "I have a dream that my four little children will one day live in a nation where they will not be judged by the color of their skin but by the content of their character."                                                                                                                                                                                                                                     
## [13] " "                                                                                                                                                                                                                                                                                                                                                                                                             
## [14] "I have a dream today!"                                                                                                                                                                                                                                                                                                                                                                                         
## [15] " "                                                                                                                                                                                                                                                                                                                                                                                                             
## [16] "I have a dream that one day, down in Alabama, with its vicious racists, with its governor having his lips dripping with the words of interposition and nullification, one day right there in Alabama little black boys and black girls will be able to join hands with little white boys and white girls as sisters and brothers."                                                                             
## [17] " "                                                                                                                                                                                                                                                                                                                                                                                                             
## [18] "I have a dream today!"                                                                                                                                                                                                                                                                                                                                                                                         
## [19] " "                                                                                                                                                                                                                                                                                                                                                                                                             
## [20] "I have a dream that one day every valley shall be exalted, and every hill and mountain shall be made low, the rough places will be made plain, and the crooked places will be made straight; and the glory of the Lord shall be revealed and all flesh shall see it together."                                                                                                                                 
## [21] " "                                                                                                                                                                                                                                                                                                                                                                                                             
## [22] "This is our hope, and this is the faith that I go back to the South with."                                                                                                                                                                                                                                                                                                                                     
## [23] " "                                                                                                                                                                                                                                                                                                                                                                                                             
## [24] "With this faith, we will be able to hew out of the mountain of despair a stone of hope. With this faith, we will be able to transform the jangling discords of our nation into a beautiful symphony of brotherhood. With this faith, we will be able to work together, to pray together, to struggle together, to go to jail together, to stand up for freedom together, knowing that we will be free one day."
## [25] " "                                                                                                                                                                                                                                                                                                                                                                                                             
## [26] "And this will be the day, this will be the day when all of God s children will be able to sing with new meaning:"                                                                                                                                                                                                                                                                                              
## [27] " "                                                                                                                                                                                                                                                                                                                                                                                                             
## [28] "My country  tis of thee, sweet land of liberty, of thee I sing."                                                                                                                                                                                                                                                                                                                                               
## [29] "Land where my fathers died, land of the Pilgrim s pride,"                                                                                                                                                                                                                                                                                                                                                      
## [30] "From every mountainside, let freedom ring!"                                                                                                                                                                                                                                                                                                                                                                    
## [31] "And if America is to be a great nation, this must become true."                                                                                                                                                                                                                                                                                                                                                
## [32] "And so let freedom ring from the prodigious hilltops of New Hampshire."                                                                                                                                                                                                                                                                                                                                        
## [33] "Let freedom ring from the mighty mountains of New York."                                                                                                                                                                                                                                                                                                                                                       
## [34] "Let freedom ring from the heightening Alleghenies of Pennsylvania."                                                                                                                                                                                                                                                                                                                                            
## [35] "Let freedom ring from the snow-capped Rockies of Colorado."                                                                                                                                                                                                                                                                                                                                                    
## [36] "Let freedom ring from the curvaceous slopes of California."                                                                                                                                                                                                                                                                                                                                                    
## [37] " "                                                                                                                                                                                                                                                                                                                                                                                                             
## [38] "But not only that:"                                                                                                                                                                                                                                                                                                                                                                                            
## [39] "Let freedom ring from Stone Mountain of Georgia."                                                                                                                                                                                                                                                                                                                                                              
## [40] "Let freedom ring from Lookout Mountain of Tennessee."                                                                                                                                                                                                                                                                                                                                                          
## [41] "Let freedom ring from every hill and molehill of Mississippi."                                                                                                                                                                                                                                                                                                                                                 
## [42] "From every mountainside, let freedom ring."                                                                                                                                                                                                                                                                                                                                                                    
## [43] "And when this happens, when we allow freedom ring, when we let it ring from every village and every hamlet, from every state and every city, we will be able to speed up that day when all of God s children, black men and white men, Jews and Gentiles, Protestants and Catholics, will be able to join hands and sing in the words of the old Negro spiritual:"                                             
## [44] "Free at last! Free at last!"                                                                                                                                                                                                                                                                                                                                                                                   
## [45] " "                                                                                                                                                                                                                                                                                                                                                                                                             
## [46] "Thank God Almighty, we are free at last!"

To import a text from your computer, save it in .txt format. The MS Word .doc format will not work.

#mi_texto <- readLines(file.choose())
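
As a rough illustration of what readLines() returns, the sketch below writes a small temporary .txt file and reads it back. The file contents and the names `path` and `lines_read` are assumptions made for the example:

```r
# Create a small temporary .txt file so the example runs on any machine
path <- tempfile(fileext = ".txt")
writeLines(c("I have a dream today!", " ", "Let freedom ring."), path)

# readLines() returns a character vector with one element per line of the file
lines_read <- readLines(path)
length(lines_read)
lines_read[1]
```

Each element of the vector later becomes one document in the corpus, which is why the speech above produces 46 documents.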

26.4 Load the text as a Corpus

mi_texto <- iconv(text, "WINDOWS-1252", "UTF-8") # Convert the text from Windows-1252 to UTF-8 so that accented and other non-ASCII characters are handled correctly

# Load the data as a corpus
docs <- Corpus(VectorSource(mi_texto))

## <<SimpleCorpus>>
## Metadata:  corpus specific: 1, document level (indexed): 0
## Content:  documents: 46
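
The iconv() call in this section changes how the characters are encoded; it does not delete them. A small sketch of that behavior, assuming a string whose ñ is stored as the single byte 0xF1 (as Windows-1252 does); the names `raw_text` and `utf8_text` are illustrative:

```r
# "ma\xf1ana" stores the letter ñ as the single byte 0xF1
raw_text <- "ma\xf1ana"

# Re-encode from Windows-1252 to UTF-8
utf8_text <- iconv(raw_text, from = "WINDOWS-1252", to = "UTF-8")

Encoding(utf8_text)   # how R marks the converted string
utf8ToInt(utf8_text)  # Unicode code points; 241 is ñ
```

If your file is already UTF-8, this conversion is probably unnecessary and may garble accented characters.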

26.5 Look at the document to evaluate its content

# inspect(docs) # remove the hashtag to view the document

26.6 Transform the text by replacing some special characters with a blank space

toSpace <- content_transformer(function (x , pattern ) gsub(pattern, " ", x))
docs <- tm_map(docs, toSpace, "-")
docs <- tm_map(docs, toSpace, "@")
docs <- tm_map(docs, toSpace, "\\|")
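
The toSpace helper is a thin wrapper around gsub(). The sketch below shows the same substitution on plain character strings in base R; `to_space_base` is a hypothetical name used only here:

```r
# Same substitution as toSpace, applied to an ordinary character vector
to_space_base <- function(x, pattern) gsub(pattern, " ", x)

to_space_base("self-evident and snow-capped", "-")

# "|" means "or" in a regular expression, so it is escaped to match a literal bar
to_space_base("a|b", "\\|")

# Alternative: fixed = TRUE treats the pattern as literal text, no escaping needed
gsub("|", " ", "a|b", fixed = TRUE)
```

This is why the pipe needs the double backslash in the tm_map() call while the hyphen and the at sign do not.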

26.7 The tm package is for text mining

# Convert the text to lower case
docs <- tm_map(docs, content_transformer(tolower))

# Remove numbers
docs <- tm_map(docs, removeNumbers)

# Remove english common stopwords
stopwords("english") # list of common english stopwords that are often removed
##   [1] "i"          "me"         "my"         "myself"     "we"        
##   [6] "our"        "ours"       "ourselves"  "you"        "your"      
##  [11] "yours"      "yourself"   "yourselves" "he"         "him"       
##  [16] "his"        "himself"    "she"        "her"        "hers"      
##  [21] "herself"    "it"         "its"        "itself"     "they"      
##  [26] "them"       "their"      "theirs"     "themselves" "what"      
##  [31] "which"      "who"        "whom"       "this"       "that"      
##  [36] "these"      "those"      "am"         "is"         "are"       
##  [41] "was"        "were"       "be"         "been"       "being"     
##  [46] "have"       "has"        "had"        "having"     "do"        
##  [51] "does"       "did"        "doing"      "would"      "should"    
##  [56] "could"      "ought"      "i'm"        "you're"     "he's"      
##  [61] "she's"      "it's"       "we're"      "they're"    "i've"      
##  [66] "you've"     "we've"      "they've"    "i'd"        "you'd"     
##  [71] "he'd"       "she'd"      "we'd"       "they'd"     "i'll"      
##  [76] "you'll"     "he'll"      "she'll"     "we'll"      "they'll"   
##  [81] "isn't"      "aren't"     "wasn't"     "weren't"    "hasn't"    
##  [86] "haven't"    "hadn't"     "doesn't"    "don't"      "didn't"    
##  [91] "won't"      "wouldn't"   "shan't"     "shouldn't"  "can't"     
##  [96] "cannot"     "couldn't"   "mustn't"    "let's"      "that's"    
## [101] "who's"      "what's"     "here's"     "there's"    "when's"    
## [106] "where's"    "why's"      "how's"      "a"          "an"        
## [111] "the"        "and"        "but"        "if"         "or"        
## [116] "because"    "as"         "until"      "while"      "of"        
## [121] "at"         "by"         "for"        "with"       "about"     
## [126] "against"    "between"    "into"       "through"    "during"    
## [131] "before"     "after"      "above"      "below"      "to"        
## [136] "from"       "up"         "down"       "in"         "out"       
## [141] "on"         "off"        "over"       "under"      "again"     
## [146] "further"    "then"       "once"       "here"       "there"     
## [151] "when"       "where"      "why"        "how"        "all"       
## [156] "any"        "both"       "each"       "few"        "more"      
## [161] "most"       "other"      "some"       "such"       "no"        
## [166] "nor"        "not"        "only"       "own"        "same"      
## [171] "so"         "than"       "too"        "very"
docs <- tm_map(docs, removeWords, stopwords("english"))

# Remove your own stop words
# specify your stop words as a character vector
docs <- tm_map(docs, removeWords, c("blabla1", "blabla2")) 

# Remove punctuation
docs <- tm_map(docs, removePunctuation)

# Eliminate extra white spaces
docs <- tm_map(docs, stripWhitespace)
# Text stemming
# docs <- tm_map(docs, stemDocument)

#stopwords() # Here are all the stopwords in the function **stopwords**
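
To see what these cleaning steps do to a single line, here is a base R approximation that does not use tm. It is a sketch, not the tm implementation: the sentence, the tiny stop-word list, and the variable names are assumptions, and unlike stripWhitespace it also trims the ends of the string:

```r
line <- "I have a dream that one day, in 1963, freedom will ring!"
stop_small <- c("i", "have", "a", "that", "in")  # tiny illustrative stop-word list

x <- tolower(line)                                   # lower case
x <- gsub("[[:digit:]]+", "", x)                     # remove numbers
pattern <- paste0("\\b(", paste(stop_small, collapse = "|"), ")\\b")
x <- gsub(pattern, "", x, perl = TRUE)               # remove whole-word stop words
x <- gsub("[[:punct:]]+", "", x)                     # remove punctuation
x <- trimws(gsub("[[:space:]]+", " ", x))            # collapse extra white space
x
```

The word boundaries (`\\b`) matter: without them, removing "a" or "in" would also eat letters inside "dream" and "ring".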

26.8 Create a matrix of the words in my document

dtm <- TermDocumentMatrix(docs) # build a term-document matrix (one row per word, one column per document)
m <- as.matrix(dtm) # convert to a regular matrix
v <- sort(rowSums(m),decreasing=TRUE) # total count of each word, sorted by frequency
d <- data.frame(word = names(v),freq=v) # create a new data frame of the words and their frequencies
head(d, n=10) # the 10 most common words
##              word freq
## will         will   17
## freedom   freedom   13
## ring         ring   12
## dream       dream   11
## day           day   11
## let           let   11
## every       every    9
## one           one    8
## able         able    8
## together together    7

26.8.1 Check your words and remove unwanted words individually
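
One possible way to do this is to filter the data frame with %in%. The sketch below rebuilds the first rows of the frequency table by hand so that it runs on its own; `d_demo` and the choice of words to drop are illustrative assumptions:

```r
# First rows of the frequency table from section 26.8, typed in by hand
d_demo <- data.frame(word = c("will", "freedom", "ring", "dream", "day", "let"),
                     freq = c(17, 13, 12, 11, 11, 11),
                     stringsAsFactors = FALSE)

# Hypothetical choice of words to drop
unwanted <- c("will", "let")

# Keep only the rows whose word is NOT in the unwanted list
d_demo <- d_demo[!(d_demo$word %in% unwanted), ]
d_demo$word
```

The same line works on the full table `d` once you have decided which words to remove.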




26.8.2 How would you remove from the data frame all words with a count of 3 or less?
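
One possible answer is a logical filter on the freq column. In this sketch the small table `d_small` is typed in by hand and its last two rows are invented so that the filter has something to remove:

```r
# Small stand-in for the frequency table; the last two rows are made up
d_small <- data.frame(word = c("will", "freedom", "together", "hope", "jail"),
                      freq = c(17, 13, 7, 3, 1),
                      stringsAsFactors = FALSE)

# Keep only words that appear more than 3 times
d_filtered <- d_small[d_small$freq > 3, ]
d_filtered

# Equivalent: subset(d_small, freq > 3)
```
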

From the wordcloud package

wordcloud(words = d$word, # the words
          freq = d$freq, # their frequencies
          min.freq = 10, # words below this frequency are not plotted
          max.words=200, # maximum number of words to plot
          random.order=TRUE, # plot the words in random order
          rot.per=0.35, # proportion of words rotated 90 degrees
          colors=brewer.pal(8, "Dark2")) # colors

26.9 Example 2

From the wordcloud2 package

wordcloud2(data = d)

26.10 How to remove stop words in another language

26.10.1 See this link for the stop words of many languages

26.11 In Spanish

# from CRAN
# install.packages("stopwords")

# Or get the development version from GitHub:
# install.packages("devtools")
# devtools::install_github("quanteda/stopwords")

26.11.1 The first 30 words in the Spanish stop-word list of the “stopwords” package

head(stopwords::stopwords("es", source = "snowball"), 30)
##  [1] "de"     "la"     "que"    "el"     "en"     "y"      "a"      "los"   
##  [9] "del"    "se"     "las"    "por"    "un"     "para"   "con"    "no"    
## [17] "una"    "su"     "al"     "lo"     "como"   "más"    "pero"   "sus"   
## [25] "le"     "ya"     "o"      "este"   "sí"     "porque"
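
A short sketch of how such a list is used, in base R and without tm. It copies only the first ten stop words from the output above; the example sentence and the variable names are assumptions:

```r
# First ten entries of the snowball list printed above
stop_es <- c("de", "la", "que", "el", "en", "y", "a", "los", "del", "se")

sentence <- "la inteligencia artificial es el estudio de los sistemas que aprenden"

# Split the sentence into words and keep those that are not stop words
tokens <- strsplit(sentence, " ", fixed = TRUE)[[1]]
kept <- tokens[!(tokens %in% stop_es)]
kept
```

The word "es" survives here only because this shortened list does not contain it.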

Example #3

Here is a third example

# install.packages("pacman") # Run this line if the pacman package is not installed

pacman::p_load("tm") # Package for preprocessing the text.
pacman::p_load("tidyverse") # Collection of packages for manipulating data (read_file() comes from readr).
pacman::p_load("wordcloud") # Package for plotting our word cloud.
pacman::p_load("RColorBrewer") # Package for selecting a color palette for our word cloud.

26.12 A document for making a word cloud: “Artificial intelligence”

articulo_IA <- ""
texto <- read_file(articulo_IA)

26.13 Convert your document into a Corpus and specify that it is in Spanish

texto2 <- VCorpus(VectorSource(texto), 
      readerControl = list(reader = readPlain, language = "es", load=TRUE))
## <<VCorpus>>
## Metadata:  corpus specific: 0, document level (indexed): 0
## Content:  documents: 1

26.14 Cleaning the document

  • remove numbers
  • remove punctuation
  • convert to lowercase
  • remove common Spanish words (stop words)
  • keep only the stem of each word; for example, the conjugations espero, esperas, espera, esperamos… all become “esper” (this step is commented out in the code below)
  • remove extra white space
texto2 <- tm_map(texto2, removeNumbers)
texto2 <- tm_map(texto2, removePunctuation)
texto2 <- tm_map(texto2, content_transformer(tolower)) # content_transformer() keeps each element a text document
texto2 <- tm_map(texto2, removeWords, stopwords::stopwords("es", source = "snowball"))
#texto2 <- tm_map(texto2, stemDocument, language="spanish")
texto2 <- tm_map(texto2, stripWhitespace)

26.15 Transform the text into a plain text document (Plain Text)

texto2 <- tm_map(texto2, PlainTextDocument)

26.16 Transform the text to replace some special characters

toSpace <- content_transformer(function (x , pattern ) gsub(pattern, " ", x))
#texto2 <- tm_map(texto2, toSpace, "-")
texto2 <- tm_map(texto2, toSpace, "@")
texto2 <- tm_map(texto2, toSpace, "\\|")

26.17 Create a matrix of the words

tabla_frecuencia <- DocumentTermMatrix(texto2)

26.18 Calculate the frequency of each word

# Pair each term with its count; this pairing assumes the corpus contains a single document.
tabla_frecuencia <- cbind(palabras = tabla_frecuencia$dimnames$Terms, 
                          frecuencia = tabla_frecuencia$v)
# Convert the values bound with cbind into a data frame.
tabla_frecuencia <- as.data.frame(tabla_frecuencia, stringsAsFactors = FALSE)
# Force the frequency column to hold numeric values.
tabla_frecuencia$frecuencia <- as.numeric(tabla_frecuencia$frecuencia)
# Sort our frequency table by its numeric values.
tabla_frecuencia<-tabla_frecuencia[order(tabla_frecuencia$frecuencia, decreasing=TRUE),]

head(tabla_frecuencia) # here we see the 6 most common words in the text
##          palabras frecuencia
## 809  inteligencia         79
## 126    artificial         67
## 1378     sistemas         32
## 1347          ser         22
## 737        humano         19
## 1172    problemas         18
wordcloud(words = tabla_frecuencia$palabras, 
          freq = tabla_frecuencia$frecuencia,
          min.freq = 5, 
          max.words = 100, 
          random.order = FALSE, 
          colors = brewer.pal(8,"Paired"))

How to view the figures in the “Plots” tab

From there you can download the word clouds as .pdf or another format.
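
A figure can also be saved from code instead of through the “Plots” tab. The following is a minimal sketch using the base graphics devices. The file name is an assumption, and a simple text plot stands in for the wordcloud() call so that the example runs without extra packages:

```r
# Hypothetical output location inside the session's temporary folder
out_file <- file.path(tempdir(), "wordcloud.pdf")

pdf(out_file, width = 7, height = 7)  # open a PDF graphics device
# Stand-in plot; replace these two lines with your wordcloud(...) call
plot.new()
text(0.5, 0.5, "freedom", cex = 3)
dev.off()                             # close the device so the file is written

file.exists(out_file)
```

This applies to wordcloud(), which draws with base graphics. The output of wordcloud2() is an HTML widget, so it is not captured by pdf().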
