Start a new topic

Any way to add a virtual header or not ignore the first row?

One of the imports we get contains no header row, the first line is data. At the moment, we're opening the file, copying the first row, and pasting it as an extra row at the top.


But, since we're trying to simplify things as much as possible, I thought I'd check if there was any way to get IOM to not ignore/assume headers for the first row? Or add in a 'virtual' row in the import it'll treat as the headers, and thus treat the first line as data.


Guessing probably not, and it's simple enough to add the extra row. But, don't ask doesn't get ;-)


sounds interesting and should be doable - can you send me (Dennis@ladnierfamily.com);

1. a copy of an input file

2. what you want the headers to be

3. Is there a file naming convention or is the file name always the same


Thanks Dennis 

Note: this is NOT a solicitation for business or anything - simply an interesting question to review.

Hi John

 

If you're able to use the IOM API it would be easy to add a header row automatically at the beginning of the import. Exactly how would depend on what format your data is in. 


1 person likes this

The file comes in with a weird extension as it's from a bank, but the contents are essentially just a CSV, and we can change the file type before we import it.


Thanks!

John, I posted a response yesterday to do this for you but it seems my response was blocked - I assume because it had my email address.  Anyway, can you send me a copy of your file, what you want the header to be and any file naming standards/conventions.  dennis@ladnierfamily.com

As always, this starts simple but when you think it through it's a tiny bit more complicated. There are two ways to deal with this, one by simply copying the first row, and one by adding a "correct" header row. The problem with copying the first row is that the copied row will be used by IOM as a header row when creating the exceptions file. That's ugly. And it would be difficult to know if the header row of the original file or the exceptions file has already been duplicated. On the other hand, adding a "correct" header row means hard-coding that row somewhere. I'm leaning towards the correct header row because it doesn't falsely duplicate actual data, and because the process can test whether it's already been added.

 

All of which means that it's still pretty easy, but since I have "real" work to do I'll be doing this out of hours, and may not have the finished code for a couple of days. But I suspect you're not in a desperate hurry. 

For the record, Dennis' replies have just come through a day late, so hope I'm not treading on any toes, and no offence meant..


I would strongly advise against sending a copy of the file to anyone. This is information from a bank, so I doubt that randomly emailing it around would meet data processing standards whatever country you're based in.


This can be done without knowing the filename, contents or headers. the fact it's a .csv file is sufficient. More interesting though?

Aye; I got Dennis' email and sent a screenshot of what type of data was in what field, as I didn't want to share 'real' data for the reasons you suggest. I've got some code back from him that I'll be taking a look at, and can look at anything you put together alongside it.


You're also right that I'm not in a great hurry; we get these files intermittently, and the practice of just inserting a 'dummy' header row manually is a workable solution. But since we've automated a lot of our IOM processes, getting it to the point where I can just drop it in the import folder with the only edit being to make it CSV, and it'll be picked up and imported on the next run, would be nice.

So, here's my version. You can either include all this code in the "beforeimport" routine as it is here, or you can create a new subroutine and call it from the beforeimport.

 

You'll need a "headers.csv" file in the same folder as the import file. Having the headers in a separate file rather than hard-coded in the API makes it easier to adapt this if the file structure changes, or indeed if someone uses this for a completely different file. The headers.csv file should contain just one header row. For example, if you have three fields in the data file, the header file would look something like this:

 

"header 1","header 2","header 3"

 

This routine will add the headers to your import data and save the result as a new file with "withHeaders.csv" appended to the filename, so it won't actually overwrite your original data. It doesn't  matter if the original filename has a .csv extension or not. And it will always check whether the text string 'header 1' is in the first line, so it won't add multiple headers if you repeat an import, or have to import an exceptions file.

 

I'm not promising this is perfect, and it surely won't win any awards for being nicely coded, but it seems to do the job.

 

Enjoy

Nick

-------------------------------------------------------------


    Public Overrides Sub BeforeImport(ByVal Cancel As ImportOM.API.iCancel)

  'Import has not been started yet

 

 Const strHeaderFileName As String= "\headers.csv"

 Const strHeader1 As String = "header 1"

 Const strOutputFileSuffix As String= ".withHeaders.csv"

 Dim objStreamReader As system.io.StreamReader

    Dim strHeaders As String = ""

    Dim strFirstLine As String = ""

 Dim strFullText As String = ""

 Dim strOutput As String = ""

 Dim strOutPutFileName As String = ""

     'Pass the importfile to a StreamReader constructor.

     objStreamReader = New io.StreamReader(import.ImportFile)

     'Read the first line of text seperately.

     strFirstLine = objStreamReader.ReadLine

  strFullText = objStreamReader.ReadToEnd

     'Close the file.

     objStreamReader.Close()

     'only continue if the headers are not already present in the first line

  If instr(lcase(strFirstLine),strHeader1) <1 Then

   'get the header row from the header file

   If io.File.Exists(import.ImportFileFolder & strHeaderFileName)

 

    strHeaders = io.file.readalltext(import.ImportFileFolder & strHeaderFileName, system.text.encoding.default) & vbCrLf

    'compile the full text

    strOutput = strHeaders & strFirstLine & VbCrLf & strFullText

    'set the name of the new file to create

    strOutPutFileName = import.importFile & strOutputFileSuffix

    Dim counter As integer

    Counter = 0

    'if the routine has already been run for some reason, don't overwrite the previous version

    While io.file.exists(strOutputFileName)

     counter = counter + 1

     strOutputFileName = import.importFile & "(" & CStr(counter) & ")" & strOutputFileSuffix

    End while

    'write the new text to the new file

    IO.File.WriteAllText(strOutPutFileName, strOutput)

    'switch the import to use the new file

    import.ImportFile = strOutputFileName

   Else

    'if there's no headers.csv file, the first data row would not be Imported.

    cancel.cancelimport("Headers.csvwas not found. Please check and try again")

   End If 'if header file exists

  End If '

 End Sub

Have been out the last week or so - here is what I provided John a week or so back - you can technically do this in one-line of code however Omatic's use of VB does not support some of the more complex file I/O features;


Instructions;


Copy the below subroutine “Write_Header” into the code of your import profile (ex. launch code editor) – it can go anywhere but my practice is to put it near the top (see image) Insert this line into the subroutine that says ‘BeforeImport” (see image).  You need to change c:\py\1.csv to whatever your directory and filename is – it does not have to be named csv – just make it whatever it is you use.

Write_Header("c:\py\1.csv","Field,Field,Field,Field,Field,Field,Field,Amount,Gift Ref, Alias,Field,Field,Date")   

 

Code:

  Public  Sub Write_Header(FName As String,Fline As String)

 

  'setup a few variables and create the objects needed

  'note you need 'Imports System.IO

  Dim fso, ts

  Dim FArr() As String

  Dim i, ii As Integer

  fso = CreateObject("Scripting.FileSystemObject")

 

   'Open's the specified file

  ts = fso.OpenTextFile(FName)

  'Loop while not at the end of the file.

  i = 1

  Do While Not ts.AtEndOfStream

  ReDim FArr(i)

  FArr(i) = ts.ReadLine()

  i = i + 1

   Loop

  'Close the file.

  ts.Close

 

  're-create file

  ts = fso.CreateTextFile(FName, True)

  'write header line

  ts.writeline(Fline)

  'write rest of the file

  For ii = 1 To i - 1

  ts.writeline(FArr(ii))

  Next ii

  'close the file

  ts.close

Login to post a comment