Hi all,
I have been working on a project that involves importing data from a .csv flatfile into a database and then analyzing the data (both done in MivaScript). I'm wondering what approach some of you might take compared with what I have done thus far.
The first version used Miva's xbase database files. The method was simply to use MvIMPORT to bring all the data in from the flatfile, scrubbing it a bit on the way in (removing quotes and such). On a flatfile of about 260,000 lines, the import took ballpark 10-15 minutes (on an office box running Miva Mia). Not bad. The other part of my script, which runs the various analyses, needed another 14 minutes or so. But the analysis part will grow more complex in the future.
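In case it helps to see it, the xbase version looked roughly like the sketch below (compiled-script style with l. scoping; the file name, field names, and sizes are placeholders, not my real schema). glosub() is what strips the quotes.

<MvCOMMENT> Sketch: import flat.csv into an xbase DBF, stripping quotes as we go </MvCOMMENT>
<MvCREATE NAME = "db" DATABASE = "data.dbf" FIELDS = "code CHAR(16), descrip CHAR(64), amount CHAR(12)">
<MvOPEN NAME = "db" DATABASE = "data.dbf">
<MvIMPORT FILE = "flat.csv" FIELDS = "l.code, l.descrip, l.amount" DELIMITER = ",">
    <MvCOMMENT> glosub() removes the double quotes around each value </MvCOMMENT>
    <MvASSIGN NAME = "db.d.code"    VALUE = "{ glosub( l.code,    asciichar( 34 ), '' ) }">
    <MvASSIGN NAME = "db.d.descrip" VALUE = "{ glosub( l.descrip, asciichar( 34 ), '' ) }">
    <MvASSIGN NAME = "db.d.amount"  VALUE = "{ glosub( l.amount,  asciichar( 34 ), '' ) }">
    <MvADD NAME = "db">
</MvIMPORT>
<MvCLOSE NAME = "db">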
So then I switched to MySQL for the backend. The nice thing with MySQL is that the analyses run in about 4 seconds instead of 14 minutes! But importing the data into MySQL using MvIMPORT with row-by-row INSERT queries takes about an hour or even more. Much, much slower. Interesting.
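The MySQL version is essentially the same loop, but with an MvQUERY INSERT inside it, roughly like this (again a sketch; the connection attributes, table, and columns are placeholders):

<MvCOMMENT> Sketch: same import, but one INSERT per row via MvQUERY (the slow way) </MvCOMMENT>
<MvOPEN NAME = "db" DATABASE = "mydb" TYPE = "mysql" USERNAME = "user" PASSWORD = "pass">
<MvIMPORT FILE = "flat.csv" FIELDS = "l.code, l.descrip, l.amount" DELIMITER = ",">
    <MvASSIGN NAME = "l.code"    VALUE = "{ glosub( l.code,    asciichar( 34 ), '' ) }">
    <MvASSIGN NAME = "l.descrip" VALUE = "{ glosub( l.descrip, asciichar( 34 ), '' ) }">
    <MvASSIGN NAME = "l.amount"  VALUE = "{ glosub( l.amount,  asciichar( 34 ), '' ) }">
    <MvCOMMENT> The ? placeholders are filled from the FIELDS list </MvCOMMENT>
    <MvQUERY NAME = "db"
             QUERY = "INSERT INTO data ( code, descrip, amount ) VALUES ( ?, ?, ? )"
             FIELDS = "l.code, l.descrip, l.amount">
</MvIMPORT>
<MvCLOSE NAME = "db">

I suspect most of that hour goes to the per-statement round trip and autocommit on every row. Wrapping batches of rows in a transaction (an MvQUERY "BEGIN" every few thousand rows followed by "COMMIT"), or building multi-row VALUES lists, would presumably help a lot, though I haven't timed either yet.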
So on the face of it, importing into Miva's native xbase DBFs is much faster than loading MySQL this way. But I would really prefer MySQL because of the flexibility it gives me in accessing the data once it is there, and the speed of querying it. As the analysis becomes more involved, MySQL's speed will be nice and, in fact, necessary.
There are also a couple of ways to bring the data straight into MySQL from the Windows command line or the MySQL command-line client (mysqlimport, or LOAD DATA INFILE from the mysql client). These are a lot faster than INSERTs, and I could easily write a batch file for them. But then there are issues with cleaning up the data first, and with streamlining all of this so a regular user can do everything from a web browser. I would really like it to be that simple.
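One idea that might keep it all browser-driven: scrub the file in MivaScript, write a clean temp file with MvEXPORT, then hand the whole load to MySQL in one statement. A sketch (it assumes the MySQL server and the Miva driver permit LOAD DATA LOCAL INFILE, and that the path MySQL sees matches where Miva writes the file, neither of which I've verified):

<MvCOMMENT> Sketch: scrub to a temp file, then one LOAD DATA statement does the import </MvCOMMENT>
<MvIMPORT FILE = "flat.csv" FIELDS = "l.code, l.descrip, l.amount" DELIMITER = ",">
    <MvASSIGN NAME = "l.code"    VALUE = "{ glosub( l.code,    asciichar( 34 ), '' ) }">
    <MvASSIGN NAME = "l.descrip" VALUE = "{ glosub( l.descrip, asciichar( 34 ), '' ) }">
    <MvASSIGN NAME = "l.amount"  VALUE = "{ glosub( l.amount,  asciichar( 34 ), '' ) }">
    <MvCOMMENT> Each MvEXPORT appends one scrubbed line to the temp file </MvCOMMENT>
    <MvEXPORT FILE = "clean.csv" FIELDS = "l.code, l.descrip, l.amount" DELIMITER = ",">
</MvIMPORT>
<MvOPEN NAME = "db" DATABASE = "mydb" TYPE = "mysql" USERNAME = "user" PASSWORD = "pass">
<MvQUERY NAME = "db"
         QUERY = "LOAD DATA LOCAL INFILE 'clean.csv' INTO TABLE data FIELDS TERMINATED BY ',' (code, descrip, amount)">
<MvCLOSE NAME = "db">

The whole thing could then sit behind a simple upload form, so the user never has to leave the browser.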
What approaches have some of you all taken in situations like this?
Regards, and thanks for any input,