Linear Regression on a String

Octave, 29 26 24 20 bytes

@(s)s/[!!s;1:nnz(s)]

Try it Online!

We have the model

y= intercept *x^0 + slope * x
 = intercept * 1  + slope * x

Here y is the vector of ASCII values of the characters of the string s, and x is the vector of their 1-based positions.

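To find the parameters intercept and slope, we can write the model as a matrix equation:

s = [intercept slope] * [1; x]

so, using the / operator, which returns the least-squares solution of an overdetermined system:

[intercept slope] = s/[1; x]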

!!s converts the string to a vector of ones with the same length as the string (assuming it contains no NUL characters).
This vector of ones is used to estimate the intercept.
1:nnz(s) is the range from 1 to the number of elements of the string, and it is used as x.
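
The division above can be sketched in Python by solving the two normal equations by hand. This is only an illustration of the same least-squares fit, not how Octave implements `/`; the name `fit_line` is hypothetical, and the sketch simply rejects strings with fewer than two characters.

```python
def fit_line(s):
    """Least-squares fit of character codes against 1-based positions.

    Hypothetical helper mirroring s/[!!s;1:nnz(s)]: one row of ones
    (intercept) and one row 1..n (slope).
    """
    y = [ord(ch) for ch in s]      # character codes of the string
    n = len(y)
    x = range(1, n + 1)            # 1-based positions, like 1:nnz(s)

    # Sums that make up the 2x2 normal equations.
    sx = sum(x)
    sxx = sum(i * i for i in x)
    sy = sum(y)
    sxy = sum(i * v for i, v in zip(x, y))

    det = n * sxx - sx * sx        # zero when n < 2: no unique line
    if det == 0:
        raise ValueError("need at least two characters")
    slope = (n * sxy - sx * sy) / det
    intercept = (sy * sxx - sx * sxy) / det
    return intercept, slope


if __name__ == "__main__":
    print(fit_line("meta.codegolf.stackexchange.com"))
```

For the string 'meta.codegolf.stackexchange.com' this should give an intercept of about 99.2516 and a slope of about 0.0145.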

Previous answer

@(s)ols(s'+0,[!!s;1:nnz(s)]')

To test, paste the following code into Octave Online:

(@(s)ols(s'+0,[!!s;1:nnz(s)]'))('meta.codegolf.stackexchange.com')

A function that accepts a string as input and applies ordinary least squares estimation to the model y = x*b + e.

The first argument of ols is y; to obtain it, we transpose the string s and add 0, which converts the characters to their ASCII codes.


TI-Basic, 51 (+ 141) bytes

Strings are 1-based in TI-Basic.

Input Str1
seq(I,I,1,length(Str1->L1
32+seq(inString(Str2,sub(Str1,I,1)),I,1,length(Str1->L2
LinReg(ax+b)

Like the other example, this outputs the equation of the best-fit line in terms of X. Str2 must already hold the following string, which is 141 bytes in TI-Basic:

!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[]^_abcdefghijklmnopqrstuvwxyz{|}~

This string cannot be part of the program because two characters in TI-Basic cannot be automatically added to a string. One is the STO-> arrow, which is not a problem because it is not part of ASCII. The other is the double quote ("), which delimits string literals and can be put into a string only by typing it into a Y= equation and using Equ>String(.
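
The lookup trick used for L2 can be sketched in Python as follows. This is a minimal illustration, assuming the lookup string holds every printable ASCII character from ! to ~ in order; the names `LOOKUP` and `code_via_lookup` are hypothetical, and the string is built with `chr()` here only as a stand-in for Str2.

```python
# Stand-in for Str2: assumed to be all printable ASCII from '!' (33) to '~' (126).
LOOKUP = "".join(chr(c) for c in range(33, 127))


def code_via_lookup(ch):
    """Recover a character code as 32 + 1-based position in LOOKUP."""
    pos = LOOKUP.find(ch) + 1   # find() is 0-based and returns -1 on a miss
    return 32 + pos             # a miss gives 32, the code of a space


if __name__ == "__main__":
    print([code_via_lookup(ch) for ch in "Hi !"])
```

Because a character that is not found maps to position 0, a space comes out as 32 without having to appear in the lookup string; any other character missing from the string would also be reported as 32.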


R, 46 45 bytes

x=1:nchar(y<-scan(,""));lm(utf8ToInt(y)~x)$co

Reads input from stdin and, for the given test case, returns (one-indexed):

(Intercept)           x 
99.25161290  0.01451613
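
The "one-indexed" qualifier affects only the intercept. As a short check, writing a for the intercept, b for the slope, and x' = x - 1 for a zero-based position (these symbols are introduced here for illustration only):

```latex
y = a + b\,x = a + b\,(x' + 1) = (a + b) + b\,x'
```

So with zero-based positions the slope would stay at 0.01451613, while the intercept would become a + b, which is about 99.26612903 for the output above.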