bafonso
Joined: 14 Jan 2009 Posts: 19

Posted: Mon Feb 02, 2009 4:24 am Post subject: Fit using data from columns 


Hi,
It would be very nice to be able to compute fits with arbitrary functions using data in columns. For example, one could ask for a fit to include data from a column for each x. This would simplify the whole fitting problem hugely, removing the need for other programs or, in my case, matlab(!).
Keep up the good work!
Last edited by bafonso on Mon Feb 02, 2009 12:24 pm; edited 1 time in total 



David Site Admin
Joined: 25 Nov 2006 Posts: 1961 Location: Chapel Hill, NC

Posted: Mon Feb 02, 2009 11:33 am Post subject: 


Like? There is an option - "Arbitrary" - that allows you to type in an expression in x. You can use multiple unknown parameters.
What function do you want to fit to the xy data? y(x) = ???(x,a,b,...)
David 



bafonso
Joined: 14 Jan 2009 Posts: 19

Posted: Mon Feb 02, 2009 12:22 pm Post subject: 


I guess I want to have information from an additional column. Basically, I am fitting to one x but I'd also like to include information from another x. I wasn't very clear on this, I'm sorry. I tried including it by typing the column name but it did not work.
y = f(x1,x2) vs y = f(x). 



David Site Admin
Joined: 25 Nov 2006 Posts: 1961 Location: Chapel Hill, NC

Posted: Mon Feb 02, 2009 12:35 pm Post subject: 


You mean fitting a function of two arguments using one or more parameters?
In the simplest case this would be a linear regression of the form y = a*x1 + b*x2.
This is currently not possible using the fit command, partly because I only draw things as a function of one argument (x). It is not that much harder to compute but it is harder to present the result graphically.
I have thought about this, but this is outside the bounds of what I plan for "DataGraph" since that application focuses on 2D (i.e. y as a function of x).
What is the structure of the fitting function?
David 
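
For reference, the simplest case mentioned above, y = a*x1 + b*x2, can be sketched outside DataGraph in a few lines. This is only an illustration of ordinary least squares through the 2x2 normal equations; the function name `fit_two_columns` and the sample data are hypothetical and have nothing to do with DataGraph's own fit command.

```python
# Sketch: least-squares fit of y = a*x1 + b*x2 via the 2x2 normal equations.
# fit_two_columns and the sample data are hypothetical, for illustration only.

def fit_two_columns(x1, x2, y):
    """Return (a, b) minimising sum((y - a*x1 - b*x2)**2)."""
    s11 = sum(u * u for u in x1)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s22 = sum(v * v for v in x2)
    s1y = sum(u * w for u, w in zip(x1, y))
    s2y = sum(v * w for v, w in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("x1 and x2 are collinear; the fit is not unique")
    # Cramer's rule on [[s11, s12], [s12, s22]] @ [a, b] = [s1y, s2y]
    a = (s1y * s22 - s2y * s12) / det
    b = (s2y * s11 - s1y * s12) / det
    return a, b


# Synthetic data generated from a = 2, b = 3, so the fit should recover them.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [0.5, 0.1, 0.9, 0.3]
y = [2.0 * u + 3.0 * v for u, v in zip(x1, x2)]
a, b = fit_two_columns(x1, x2, y)
```

The computation is indeed small; the difficulty described in the post is about how to draw the result, not about solving for a and b.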



bafonso
Joined: 14 Jan 2009 Posts: 19

Posted: Mon Feb 02, 2009 12:50 pm Post subject: 


David wrote:  You mean fitting a function of two arguments using one or more parameters?
In the simplest case this would be a linear regression of the form y = a*x1 + b*x2.
This is currently not possible using the fit command, partly because I only draw things as a function of one argument (x). It is not that much harder to compute but it is harder to present the result graphically.
I have thought about this, but this is outside the bounds of what I plan for "DataGraph" since that application focuses on 2D (i.e. y as a function of x).
What is the structure of the fitting function?
David 
Parameters are fine; they work nicely. The issue is having more than one argument. What do you mean to represent graphically? I still want a "2d" function that is simply a function of two "1d" arguments instead of just one. I do not want to create a function that spans over space. I just want it to take into account two discrete arguments.
Currently, it's an arbitrary function somewhere along the lines of y = a*x + c. The problem is that there's another critical element that changes over time, and it would be nice to include it in order to get a better fit. I know I could create a new column that combines the other two, but I'd love to have the flexibility of not having to do so, since sometimes it is not clear in advance what relationship I will need between them.
Just a thought... I'm sure some people could eventually use it too.



Mike
Joined: 05 May 2008 Posts: 278 Location: Christchurch, New Zealand

Posted: Mon Feb 09, 2009 8:42 pm Post subject: 


Slow reply, but I just came across this.
I think this depends upon what you think DataGraph is. I view it as a tool to produce flexible and publication quality 2D graphs. i.e. it is focussed on presentation rather than data exploration and analysis (although I'm constantly impressed by how much power David has hidden away within it). That is, the data has been rearranged, processed and analysed before being imported into DataGraph. Although it has a very flexible univariate fitting mechanism, it doesn't offer the flexibility of a programming and analysis environment like Matlab. But what it does offer is effective for creating 95% of the figures one would find in a typical journal article. To do something more specialised is going to require a different tool.
If faced with a similar requirement, I would do the heavy lifting in another environment (e.g. you mention Matlab). i.e. do your fit based on the combination of x1 and x2 and then paste discrete values of it in as a column of fitted y values. Then plot the raw y as a function of x1 and then superimpose another plot representing the fitted y values from your externally calculated fit rather than asking DataGraph to calculate the function itself.
i.e. use DataGraph to depict the fit rather than calculate it. It can certainly do a better job of this than Matlab.
I know we all have our own ways of working, but I think this request probably falls on the wrong side of the 80/20 rule. 
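
A minimal sketch of the workflow described above, assuming the fit has already been done elsewhere: evaluate the externally fitted model at each data row and write the result as an extra `y_fit` column in a delimited text table. The helper `fitted_column_csv`, the column names, and the model coefficients are all hypothetical.

```python
import csv
import io

# Sketch: turn an externally computed fit into a column of fitted y values.
# fitted_column_csv, the column names and the coefficients are hypothetical.

def fitted_column_csv(x1, x2, y, model):
    """Return CSV text with columns x1, x2, y and the fitted value y_fit."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["x1", "x2", "y", "y_fit"])
    for u, v, w in zip(x1, x2, y):
        writer.writerow([u, v, w, model(u, v)])
    return buf.getvalue()


def model(u, v):
    # Assumed result of a fit done in another environment.
    return 2.0 * u + 3.0 * v


text = fitted_column_csv([1.0, 2.0], [0.5, 0.1], [3.4, 4.4], model)
```

The resulting table could then be brought into the graphing program, with the raw y and the `y_fit` column drawn as two plots against x1.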



bafonso
Joined: 14 Jan 2009 Posts: 19

Posted: Mon Feb 09, 2009 8:53 pm Post subject: 


Well, what you say is what everyone knows, but since there is a subforum specifically dedicated to requests for fitting, I thought I'd just try it.
I see DG as a graphing program, but it never hurts to have features that speed up some processes.
b 



David Site Admin
Joined: 25 Nov 2006 Posts: 1961 Location: Chapel Hill, NC

Posted: Tue Feb 10, 2009 12:03 am Post subject: 


Since this is inherently 2D (and in general, many more dimensional) I don't think I can bend DataGraph to do this. However, once you have the fit, I can easily compute the error at each point, and you can display that error in multiple ways in DataGraph.
I guess the issue is that the design of DataGraph is
#1 - A structured drawing program
#2 - Includes a lot of common standard statistical tools for one or two columns of numbers (one - histogram, two - fit).
#3 - Data mining, where you filter data, sort etc. (automatic = mask, manual = sorting, removing etc).
For this topic, what I have in mind when I say "Fit Function" is additional fits for (x,y) type data. I am open to allowing for additional columns for masking, weights etc. I can stuff a lot into the existing Fit drawing command, and am willing to look into creating a second fit drawing command if needed, but that would not be for fitting (x1,x2,...,xn,y) data but instead if you wanted drastically different fits such as distribution fits that can't be squeezed into the histogram command or wanted a more expert mode fit (and some of the complexity would then be offloaded from the standard fit command).
Regarding matlab, I recently added a feature that matlab users will hopefully like. Do the following:
In matlab:
Create lists, say called x, y, z
save them into a file using
save filename x y z -v4
(note the -v4 is crucial)
In DataGraph:
Drag the file filename.mat onto the DataGraph table, or import the file using the menu option. If you want the data to overwrite existing columns, select them before you drag/import the matlab data file.
This also works from the command line. You can therefore create a matlab script that saves the data into /tmp/data.mat, then launches dgraph by using the ! escape in matlab. This is a way to improve the graphics output of matlab.
David 



myndsurfer
Joined: 30 Jun 2012 Posts: 4 Location: Auckland

Posted: Sat Jun 30, 2012 7:55 pm Post subject: 


bafonso wrote:  I guess I want to have information from an additional column. Basically, I am fitting to one x but I'd also like to include information from another x. I wasn't very clear on this, I'm sorry. I tried including it by typing the column name but it did not work.
y = f(x1,x2) vs y = f(x). 
This is exactly the kind of thing I want to do. It seemed "obvious" to me that when I compared Datagraph with the reviews of other graph and curve fitting software that this is what Datagraph could do.
I'm somewhat disappointed to find that Datagraph currently does not fit y = f(x1,x2, ... xn).
I have daily temperature variations, measured approximately every hour, over several weeks.
DataGraph gives me a lovely fit using the 'Arbitrary Fit' feature:
y = a + b*sin(2*pi*(x - d)) + f*x + g*x^2 + h*x^3
Revealing, fantastically awesomely:
12.925 + (-7.7231)*sin(2*pi*(x - 1.8556)) + (-1.9178)*x + 0.27702*x^2 + (-0.010511)*x^3
where
y = temperature in the cavity of my roof
x is the date and time of day expressed as a decimal day.time.
However I believe I can improve the fit by including data from another column, x2, which would be the per cent cloud cover (or humidity would be an x3 I could include).
In other words, I want to fit:
y = a + b*sin(2*pi*(x - d)) + j*x2
where x2 is another column of data.
Or I might want to fit:
y = a + b*sin(2*pi*(x - d))*x2
Or many others.... the value of Datagraph is the incredibly quick way I can visualise the goodness of fit of the equation forms I dream up.
Please implement this!
Some examples of the plots and fits I'm producing (after my first few hours of using Datagraph)... I am so happy at the increase in productivity!
Here's an example I'm working on.
_________________ Peter J MELLALIEU
Auckland, New Zealand 
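
As an aside on the first model above, y = a + b*sin(2*pi*(x - d)) + j*x2: since b*sin(2*pi*(x - d)) = p*sin(2*pi*x) + q*cos(2*pi*x) with p = b*cos(2*pi*d) and q = -b*sin(2*pi*d), the fit is linear in (a, p, q, j) and can be sketched with plain linear least squares. The code below is only an illustration under that assumption; the function names and the synthetic "cloud cover" data are hypothetical, and this is not how DataGraph computes its fits.

```python
import math

# Sketch: fit y = a + b*sin(2*pi*(x - d)) + j*x2 by rewriting the sine term
# as p*sin(2*pi*x) + q*cos(2*pi*x), which makes the problem linear.
# All names and the synthetic data are hypothetical.

def solve_linear(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system; columns are not independent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * z[c] for c in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def fit_daily_cycle(x, x2, y):
    """Return (a, b, d, j), with b >= 0 and d in (-0.5, 0.5]."""
    rows = [[1.0, math.sin(2 * math.pi * t), math.cos(2 * math.pi * t), c]
            for t, c in zip(x, x2)]
    k = 4
    # Normal equations: (R^T R) z = R^T y
    ata = [[sum(r[i] * r[m] for r in rows) for m in range(k)] for i in range(k)]
    aty = [sum(r[i] * w for r, w in zip(rows, y)) for i in range(k)]
    a, p, q, j = solve_linear(ata, aty)
    b = math.hypot(p, q)
    d = math.atan2(-q, p) / (2 * math.pi)
    return a, b, d, j


# Hourly samples over three days, generated from a=12, b=7, d=0.3, j=-0.05.
x = [i / 24 for i in range(72)]
x2 = [float((i * 37) % 100) for i in range(72)]
y = [12.0 + 7.0 * math.sin(2 * math.pi * (t - 0.3)) - 0.05 * c
     for t, c in zip(x, x2)]
a, b, d, j = fit_daily_cycle(x, x2, y)
```

Note that b and d are only determined up to the usual ambiguity (flipping the sign of b shifts d by half a day), so a fit reported with a negative b is equivalent. The second model, where x2 multiplies the sine term, can be handled the same way by multiplying the sine and cosine columns by x2.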



David Site Admin
Joined: 25 Nov 2006 Posts: 1961 Location: Chapel Hill, NC

Posted: Sat Jun 30, 2012 10:56 pm Post subject: 


I'm working on this in the latest beta. It's called "Multivariable Fit". The issue here is that it is a little hard to draw the fit on the graph, so what I'm doing is to draw the projections at each point. You can pick which projection it is. The hope is that you can tell from there where the fit is good and where it isn't. The results are displayed in the table.
It is not in the release version since I want users to beat up on it a little bit before I enable it in the release version.
Give it a spin and tell me what works/doesn't work. The beta now has a handy menu option to make it easier to send me email (in the Help menu).
The examples didn't come through since it is really a html page. You can use the URL tag instead. What should happen is that the multivariable fit function will be drawn as small vertical lines connecting the data point and the value of the fit at that point.
David 



myndsurfer
Joined: 30 Jun 2012 Posts: 4 Location: Auckland

Posted: Sun Jul 01, 2012 6:28 am Post subject: 


David wrote:  I'm working on this in the latest beta. It's called "Multivariable Fit". The issue here is that it is a little hard to draw the fit on the graph, so what I'm doing is to draw the projections at each point. You can pick which projection it is. The hope is that you can tell from there where the fit is good and where it isn't. The results are displayed in the table. 
That's great. Such a prompt response that I was delighted to receive whilst celebrating US Independence Day here on the other side of the International Date Line!
I'm downloading the beta now. I look forward to giving it a whirl tomorrow morning. (Again... the first download failed for reasons unknown. Beta105 for Lion.... Should I quit and rename my existing Datagraph app?)
Here are links to the dropbox that include the various plots I'm making, and the DataGraph file I'm exploring:
https://www.dropbox.com/s/5nyh3ex550uugto/Roof%20cavity%20plot%202%20poly.jpg
https://www.dropbox.com/s/q5mprdu99dtkmez/Cavity%20v%20hour%20of%20day.jpg
And here is the DataGraph file in .zip format.
https://www.dropbox.com/s/zngryqzr5wzgms7/Roof%20cavity%20plot.dgraph.zip
_________________ Peter J MELLALIEU
Auckland, New Zealand 



David Site Admin
Joined: 25 Nov 2006 Posts: 1961 Location: Chapel Hill, NC

Posted: Sun Jul 01, 2012 8:46 am Post subject: 


I don't know exactly what the best way is to keep both the App Store version and the beta. The beta version is fully unlocked when you have the Mac App Store version, and that is done through a file that needs to be renewed every few months.
One option is to remove the Mac App store version from the disk, and then when you want to go back you just go to the Purchased tab in the store and reinstall it. You can of course have two versions on the machine, and just launch the one you want. The multivariable fit is still missing some functionality and I will look at it later today to see if I can give it some attention and upload a new version.
What I think should work is as follows:
To run the beta:
- If you are running this on a different machine (e.g. a new laptop), install DataGraph from the App Store and launch it once.
- Drag the DataGraph app into the trash and empty it.
- Download the beta.
- Run it.
To switch back to the App Store version:
- Delete the DataGraph beta.
- Go to the Mac App Store and install it from the Purchased tab.
To renew the license that the beta uses to check whether you have an App Store version (you shouldn't have to do this more than once or twice a year if you stay on the beta):
- Switch to the App Store version as described above.
- Switch to the beta version as described above.
If this doesn't work please let me know. The app store version will be updated periodically and at least once a year. I'm waiting for OS X 10.8 to come out to see if there is anything that comes up (and to see if a sandbox bug in OS X 10.7 has been fixed).
David 



myndsurfer
Joined: 30 Jun 2012 Posts: 4 Location: Auckland

Posted: Sun Jul 01, 2012 7:52 pm Post subject: 


David wrote:  I don't know exactly what the best way is to keep both the App Store version and the beta. The beta version is fully unlocked when you have the Mac App Store version, and that is done through a file that needs to be renewed every few months.

This procedure worked for me:
Close DataGraph
Rename current DataGraph.app to Datagraph_MAIN.app
Download DataGraph Beta
Run DataGraph Beta (directly from the Downloads folder, without installing into the Applications folder)
Nothing horrible has happened.
_________________ Peter J MELLALIEU
Auckland, New Zealand 



myndsurfer
Joined: 30 Jun 2012 Posts: 4 Location: Auckland

Posted: Sun Jul 01, 2012 8:10 pm Post subject: 


David wrote:  The multivariable fit is still missing some functionality and I will look at it later today to see if I can give it some attention and upload a new version.

I have had a quick explore with the new Multivariable Fit command in the 3.01 beta.
First reactions:
I like the way you have implemented specifying the x variables, as x1, x2, etc for use in the Equation fit.
.. and the way of specifying which of the x variables will form the plot horizontal axis.
I MISS all the statistical output data (R^2) and the 'Export Column' features of the existing Fit command.
I LIKE the vertical lines drawn between the observed y and predicted y for each plot x point.
I MISS seeing a continuous plot of the fit function. Though now, as I think about it (duh!), that's because the intermediate values for the non-axis x1, x2, or x3 values are missing... they would need to be interpolated from somewhere... so I see you have done all that seems possible.
So back to my drawing board...
Anyway, a most helpful start.
Top of my preferences for tweaking is the presentation of the R^2 for the fit.
_________________ Peter J MELLALIEU
Auckland, New Zealand 



David Site Admin
Joined: 25 Nov 2006 Posts: 1961 Location: Chapel Hill, NC

Posted: Mon Jul 02, 2012 10:01 pm Post subject: 


I improved the Multivariable Fit command in the latest beta (just uploaded).
Added statistical variables R^2 and sigma. I also added the fit results and made them available from the token field.
You can select the Multiple Fit command in the "From Command" column to extract the residual, fit values, ... I don't have a menu in the command yet to create the columns.
And drawing the function is not possible since the x1 value could be the same with different x2 values.
David 
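
For readers who want to check such numbers by hand, here is a minimal sketch of how residuals, R^2 and a sigma value can be computed from observed and fitted y values. The definition of sigma used here (residual standard error with n minus the number of parameters as degrees of freedom) is an assumption; the thread does not say which definition DataGraph uses. The function name and data are hypothetical.

```python
import math

# Sketch: residuals, R^2 and sigma for a set of observed and fitted values.
# The sigma definition (residual standard error) is an assumption and may
# differ from what DataGraph reports.

def fit_statistics(y, y_fit, n_params):
    """Return (residuals, r_squared, sigma)."""
    residuals = [obs - pred for obs, pred in zip(y, y_fit)]
    ss_res = sum(r * r for r in residuals)
    mean_y = sum(y) / len(y)
    ss_tot = sum((obs - mean_y) ** 2 for obs in y)
    r_squared = 1.0 - ss_res / ss_tot
    sigma = math.sqrt(ss_res / (len(y) - n_params))
    return residuals, r_squared, sigma


y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_fit = [1.1, 1.9, 3.2, 3.8, 5.0]
residuals, r_squared, sigma = fit_statistics(y_obs, y_fit, n_params=2)
```

The residuals are the same quantities that the vertical lines between each data point and its fitted value represent on the graph.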



