Rough notes on SQL Server bulk insert

(I recently found these notes that I had written up in January and Feburary of 2010. I am posting them in the hope that they may be a help to me and others, even in their rough form.)

It worked to just generate all the fields, with empty fields for ones I didn’t care to fill in.

for %d in (Foo Bar Baz Quux) do bcp MyDatabase.dbo.%d format nul -T -c -f %d.fmt

This resulted in the creation of Foo.fmt, Bar.fmt, Baz.fmt, and Quux.fmt files.


1       SQLCHAR       0       22      "\t"   1     FooID                            SQL_Latin1_General_CP1_CI_AS
2       SQLINT        0       4       "\t"   2     FooTypeID                        ""
3       SQLINT        1       4       "\t"   3     TitleID                            ""
4       SQLNCHAR      2       100     "\t"   4     LastName                           SQL_Latin1_General_CP1_CI_AS
5       SQLNCHAR      2       100     "\t"   5     FirstName                          SQL_Latin1_General_CP1_CI_AS
6       SQLNCHAR      2       100     "\t"   6     MiddleName                         SQL_Latin1_General_CP1_CI_AS
7       SQLNCHAR      2       100     "\t"   7     Department                         SQL_Latin1_General_CP1_CI_AS
8       SQLNCHAR      2       200     "\t"   8     Company                            SQL_Latin1_General_CP1_CI_AS
9       SQLCHAR       2       22      "\t"   9     OrgID                              SQL_Latin1_General_CP1_CI_AS
10      SQLTINYINT    0       1       "\r\n"   10    Active                             ""

Continue reading

Automation-assisted manual transformations

I had been unit testing an SQL generator and had a bunch of tests that gave various input to the generator and tested its output against expected SQL.   The SQL in my tests looked like this:

Now I was ready to feed the SQL to a database engine to verify that it was validly formed.  I would generally use grep for this type of task; but here my SQL statements were formatted multiline for easier reading, and grep operates in a line-by-line mode.  There were over 100 test cases, so it was worth figuring out an automated solution.  I also wanted to avoid writing a single-purpose text-processing utility if possible.

I ended up writing down the following steps for myself:

  1. Turn the tests into a single line of text for ease of working with tools
    On command line: cat Test*.cpp | tr -d "\n\r" > all-one-line.out
  2. Discard everything but the queries, inserting a newline after each
    In editor (SlickEdit for me), open all-one-line.out and Replace All (using Perl-style regexes):
    .*?("(?:\( )*SELECT[^;]+?;)
  3. Clean up what the regex didn’t
    Delete the last line
  4. Get rid of quotes
    Replace \" with nothing
  5. Get rid of semicolons
    Replace ;$ with nothing
  6. Get rid of extra spaces
    Replace <space>+ with <space>
  7. Save the file in the editor
  8. Get rid of Oracle-specific tests
    grep --invert-match TO_DATE < all-one-line.out > all-one-line.sql
  9. Let cool for 5 minutes before serving
    Paste all-one-line.sql into MS SQL Server Management Studio and execute (with Results to Text)

This may look like a large number of steps, but I got to where I could run through them in about 30 seconds and test all 130  queries on the server.  Nice!

Future improvements

Once I had the ability to test my test output against the database server, I wanted to do that each time the tests’ expected results changed.  So where I had originally envisioned a single smoke test run, I ended up going through these automation-assisted manual steps ten or twenty times.  In retrospect, the single-purpose utility script would clearly have been the better approach after all.  I need to get more comfortable whipping up such scripts to lower the barrier to writing them when these occasions arise.

Twelve years in, I would think I would be at the top of my craft by now, but there are still things that seem pretty basic that I’m learning.  Hmmm…I wonder if life really does begin at 40?