Tuesday, September 21, 2021

Calling [Compiled] Swift from R: Part 2

The previous post introduced the topic of how to compile Swift code for use in R using a useless, toy example. This one goes a bit further and makes a case for why one might want to do this by showing how to use one of Apple’s machine learning libraries, specifically the Natural Language one, focusing on extracting parts of speech from text.

I made a parts-of-speech directory to keep the code self-contained. In it are two files. The first is partsofspeech.swift (swiftc seems to dislike dashes in names of library code and I dislike underscores):

NOTE: I didn’t replace the @’s this time, so just ignore the incorrect Twitter links it created.

import NaturalLanguage
import CoreML

extension Array where Element == String {
  var SEXP: SEXP? {
    let charVec = Rf_protect(Rf_allocVector(SEXPTYPE(STRSXP), count))
    defer { Rf_unprotect(1) }
    for (idx, elem) in enumerated() { SET_STRING_ELT(charVec, idx, Rf_mkChar(elem)) }
    return(charVec)
  }
}

@_cdecl("part_of_speech")
public func part_of_speech(_ x: SEXP) -> SEXP {

  let text = String(cString: R_CHAR(STRING_ELT(x, 0)))
  let tagger = NLTagger(tagSchemes: [.lexicalClass])

  tagger.string = text

  let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace]

  var txts = [String]()
  var tags = [String]()

  tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass, options: options) { tag, tokenRange in
    if let tag = tag {
      txts.append("\(text[tokenRange])")
      tags.append("\(tag.rawValue)")
    }
    return true
  }

  let out = Rf_protect(Rf_allocVector(SEXPTYPE(VECSXP), 2))
  SET_VECTOR_ELT(out, 0, txts.SEXP)
  SET_VECTOR_ELT(out, 1, tags.SEXP)
  Rf_unprotect(1)

  return(out!)
}

The other is bridge code that seems to be the same for every one of these (or could be), so I’ve just named it swift-r-glue.h (it’s the same as the bridge code in the previous post):

#define USE_RINTERNALS

#include <R.h>
#include <Rinternals.h>

const char* R_CHAR(SEXP x);

Let’s walk through the Swift code.

We need two imports:

import NaturalLanguage
import CoreML

to make use of the NLP functionality provided by Apple.

The following extension to Array (constrained to String elements):

extension Array where Element == String {
  var SEXP: SEXP? {
    let charVec = Rf_protect(Rf_allocVector(SEXPTYPE(STRSXP), count))
    defer { Rf_unprotect(1) }
    for (idx, elem) in enumerated() { SET_STRING_ELT(charVec, idx, Rf_mkChar(elem)) }
    return(charVec)
  }
}

will reduce the amount of code we need to type later on to turn Swift String Arrays into R character vectors.

The start of the function:

@_cdecl("part_of_speech")
public func part_of_speech(_ x: SEXP) -> SEXP {

tells swiftc to make this a C-compatible call and notes that the function takes one parameter (in this case, it’s expecting a length 1 character vector) and returns an R-compatible value (which will be a list that we’ll turn into a data.frame in R just for brevity).
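As a minimal sketch of what @_cdecl does on its own, independent of R: the attribute exports the function under the given unmangled C symbol name, which is how .Call can later locate part_of_speech in the compiled library. The names add_two and addTwo below are hypothetical and only for illustration, and @_cdecl itself is an underscored attribute that Swift does not promise to keep stable.

```swift
// Hypothetical example: export a Swift function under the C symbol "add_two".
// Only C-representable types (here Int32) may appear in the signature.
@_cdecl("add_two")
public func addTwo(_ x: Int32) -> Int32 {
    return x + 2
}

// From Swift, the function is still called by its Swift name.
print(addTwo(40))  // prints "42"
```

In the real function the parameter and return types are SEXP, which is a C pointer type and therefore also acceptable in a @_cdecl signature.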

The following sets up our inputs and outputs:

  let text = String(cString: R_CHAR(STRING_ELT(x, 0)))
  let tagger = NLTagger(tagSchemes: [.lexicalClass])

  tagger.string = text

  let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace]

  var txts = [String]()
  var tags = [String]()

We convert the passed-in parameter to a Swift String, initialize the NLP tagger, and set up two arrays to hold the results (the sentence component goes in txts and that component’s part of speech goes in tags).

The following code is mostly straight from Apple and (inefficiently) populates the previous two arrays:

  tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass, options: options) { tag, tokenRange in
    if let tag = tag {
      txts.append("\(text[tokenRange])")
      tags.append("\(tag.rawValue)")
    }
    return true
  }
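If you want to check the tagging logic before involving R at all, the same calls can be exercised as a plain Swift script on macOS. This is only a sketch: the helper name lexicalTags(for:) and its tuple return type are hypothetical and not part of the original code, while the NLTagger calls are the same ones used above.

```swift
import NaturalLanguage

// Hypothetical helper (not in the original post): returns (token, tag) pairs
// for each word in the text, skipping punctuation and whitespace.
func lexicalTags(for text: String) -> [(token: String, tag: String)] {
    let tagger = NLTagger(tagSchemes: [.lexicalClass])
    tagger.string = text

    let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace]
    var pairs = [(token: String, tag: String)]()

    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: .lexicalClass,
                         options: options) { tag, tokenRange in
        if let tag = tag {
            pairs.append((token: String(text[tokenRange]), tag: tag.rawValue))
        }
        return true  // keep enumerating
    }
    return pairs
}

// Print one "token tag" pair per line for a short sample sentence.
for pair in lexicalTags(for: "The comm wasn't working.") {
    print(pair.token, pair.tag)
}
```

Collecting pairs in one array, rather than two parallel arrays, is just a convenience for printing; the R-facing function keeps two arrays because each one becomes a column of the returned list.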

Finally, we use the Swift-R bridge to make a list much like one would in C:

  let out = Rf_protect(Rf_allocVector(SEXPTYPE(VECSXP), 2))
  SET_VECTOR_ELT(out, 0, txts.SEXP)
  SET_VECTOR_ELT(out, 1, tags.SEXP)
  Rf_unprotect(1)

  return(out!)

To get a shared library we can use from R, we just need to compile this like last time:

swiftc \
  -I /Library/Frameworks/R.framework/Headers \
  -F/Library/Frameworks \
  -framework R \
  -import-objc-header swift-r-glue.h \
  -emit-library \
  partsofspeech.swift

Let’s run that on some text! First, we’ll load the new shared library into R:

dyn.load("libpartsofspeech.dylib")

Next, we’ll make a wrapper function to avoid messy .Call(…)s and to make a data.frame:

parts_of_speech <- function(x) {
  res <- .Call("part_of_speech", x)
  as.data.frame(stats::setNames(res, c("name", "tag")))
}

Finally, let’s try this on some text!

tibble::as_tibble(
  parts_of_speech(paste0(c(
"The comm wasn't working. Feeling increasingly ridiculous, he pushed",
"the button for the 1MC channel several more times. Nothing. He opened",
"his eyes and saw that all the lights on the panel were out. Then he",
"turned around and saw that the lights on the refrigerator and the",
"ovens were out. It wasn’t just the coffeemaker; the entire galley was",
"in open revolt. Holden looked at the ship name, Rocinante, newly",
"stenciled onto the galley wall, and said, Baby, why do you hurt me",
"when I love you so much?"
  ), collapse = " "))
)
## # A tibble: 92 x 2
##    name         tag
##    <chr>        <chr>
##  1 The          Determiner
##  2 comm         Noun
##  3 was          Verb
##  4 n't          Adverb
##  5 working      Verb
##  6 Feeling      Verb
##  7 increasingly Adverb
##  8 ridiculous   Adjective
##  9 he           Pronoun
## 10 pushed       Verb
## # … with 82 more rows

FIN

If you’re playing along at home, try adding a function to this Swift file that uses Apple’s entity tagger.
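One possible starting point for that exercise, sketched as plain Swift with no R glue. The helper name namedEntities(in:) and the sample sentence are hypothetical; the .nameType scheme, the .joinNames option, and the three entity tags come from Apple’s NaturalLanguage framework. To call it from R you would still need to wrap it in a @_cdecl function that converts to and from SEXP, the way part_of_speech does.

```swift
import NaturalLanguage

// Hypothetical helper: returns (text, kind) pairs for people, places,
// and organizations that the tagger recognizes in the input.
func namedEntities(in text: String) -> [(text: String, kind: String)] {
    let tagger = NLTagger(tagSchemes: [.nameType])
    tagger.string = text

    // .joinNames keeps multi-word names together as a single token.
    let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]
    let wanted: [NLTag] = [.personalName, .placeName, .organizationName]
    var found = [(text: String, kind: String)]()

    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: .nameType,
                         options: options) { tag, tokenRange in
        // Words that are not part of a name get other tags, so filter them out.
        if let tag = tag, wanted.contains(tag) {
            found.append((text: String(text[tokenRange]), kind: tag.rawValue))
        }
        return true
    }
    return found
}

// Which entities are found depends on Apple's model, so none are promised here.
for entity in namedEntities(in: "Tim Cook presented for Apple in Cupertino.") {
    print(entity.text, entity.kind)
}
```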

The next installment of this topic will be how to wrap all this into a package (then all these examples get tweaked and go into the tome).

*** This is a Security Bloggers Network syndicated blog from rud.is authored by hrbrmstr. Read the original post at: https://rud.is/b/2021/01/24/calling-compiled-swift-from-r-part-2/