Data conversion example in Perl

This is a one-off script that converts Dinesh Prabhu’s Romanized Urdu-English dictionary into comma-separated values (.CSV) format, so that it can be opened in Microsoft Excel and then pasted into the Excel workbook included with Günther’s Vocabulary Cards for Palm.

open (INPUT, "< $ARGV[0]") || die "Can't open file $ARGV[0].\n";

while (<INPUT>) {

	if ($_ =~ /([^\:]+)\:\s*([PATHSG])\s+(n\.f\.|n\.m\.|pron\.|adj\.|v\.|adv\.|intj\.|prep\.|suff\.|pref\.)\s+(.*)/)
	{
		($urdu, $origin, $part, $english) = ($1, $2, $3, $4);
		print "\"$urdu \[$part\] \[$origin\]\",\"\",\"\",\"\",\"$english\"\n";
	}

}
close(INPUT);
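
For comparison, here is a minimal Python sketch of the same conversion. It assumes each dictionary line looks like "word: ORIGIN part-of-speech meaning", which is what the regular expression above expects. The sample entry and the names convert_line and to_csv are made up for illustration. Unlike the Perl version, the csv module also doubles any quote characters inside a field.

```python
import csv
import io
import re

# Same pattern as the Perl script, with the dots in the abbreviations escaped.
ENTRY = re.compile(
    r"([^:]+):\s*([PATHSG])\s+"
    r"(n\.f\.|n\.m\.|pron\.|adj\.|v\.|adv\.|intj\.|prep\.|suff\.|pref\.)\s+(.*)"
)


def convert_line(line):
    """Return the five CSV fields for one dictionary line, or None if it does not match."""
    m = ENTRY.search(line)
    if m is None:
        return None
    urdu, origin, part, english = m.groups()
    return [f"{urdu} [{part}] [{origin}]", "", "", "", english]


def to_csv(lines):
    """Convert matching lines to fully quoted CSV text, skipping everything else."""
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for line in lines:
        row = convert_line(line)
        if row is not None:
            writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = ["kitaab: A n.f. book\n"]  # hypothetical entry, not taken from the dictionary
    print(to_csv(sample), end="")
```

The three empty fields are kept only because the Perl script emits them between the Urdu column and the English column.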

Batch files to convert mp3 podcasts to space-saving spx files

If you Google for a way to convert mp3 files to Speex format, you won’t find much. I couldn’t find any method on the internet to convert them automatically and easily, so the batch files below may be a first of sorts. They have worked for me on Windows 2000 and Windows Vista. They do have problems with certain characters in podcast names, such as ®, which I can fix if there is any interest.

Why would you want to convert one lossy audio format to another, with the noticeable loss of quality that results? Because podcasts aren’t really available in .wav format, and .spx is more efficient than .mp3 for the spoken word, which saves a little space on my Palm Tungsten T3’s SD card.

If you want to get this working yourself, you’ll need the Juice podcast receiver, the ffmpeg Windows binaries and speexenc (look for the speexw installer).
I will assume you extract ffmpeg to C:\TOOLS, and install speex to the default C:\SPEEXW, and that you copy and paste the batch files into Notepad and save to C:\BAT. Make sure that you append ;C:\TOOLS;C:\SPEEXW;C:\BAT to your existing PATH environment variable. You will also have to make sure the folders C:\PODCASTS and C:\PDA2GO exist.

Look for ipodder.cfg. On Vista, it should be in C:\Users\YOUR LOGIN NAME\AppData\Roaming\iPodder. Change the download directory to C:\\PODCASTS (double backslashes intentional) which is used in the batch files below. (In Windows 2000 or XP, you can also change this in the GUI, but in Vista you can’t even open the GUI until you change it because the line refers to My Documents which in Vista has changed to simply Documents.)

From Juice, click File > Preferences, go to the Advanced tab, and check the box which says “Run this command after each download” and type the following in the text box below it:
podwork "%f" "c:\pda2go"

If you are running Windows Vista, you will need to open the properties of Juice.exe and set it to run in Windows XP SP2 compatibility mode (this is a requirement for running Juice 2.2 on Vista, not a requirement of podwork.cmd).

The podwork.cmd script below converts a single file from mp3 to speex format. You will want to experiment with the quality variable (10 is the maximum, and at 7 the lesser quality is noticeable).

podwork.cmd

@echo off
rem podwork.cmd
rem thanks to 1CMDFAQ.TXT by Prof. Timo Salmi for
rem bringing my DOS skills into the 21st Century :)
rem See http://www.netikka.net/tsneti/info/tscmd.htm
setlocal enableextensions
set quality=7

rem set pod to your source folder (no trailing slash)
set pod=c:\podcasts

if "%~2"=="" goto :EOF
if "%tmp%"=="" goto :EOF

set ptmp=%tmp%\
set ptmp=%ptmp:\\=\%
set ptmp=%ptmp:~0,-1%

if not exist "%ptmp%\*.*" goto :EOF

set pfol=%~dp1
set pfol=%pfol%\
set pfol=%pfol:\\=\%
set pfol=%pfol:~0,-1%
for %%a in ("%pfol%") do set pfol=%%~nxa

set tful=%~2\
set tful=%tful:\\=\%
set tful=%tful:~0,-1%

if not exist "%tful%\*.*" goto :EOF
for %%a in ("%tful%") do set tfol=%%~nxa

rem Make necessary tree structures
xcopy "%pod%" "%ptmp%\podwork\" /t
xcopy "%pod%" "%tful%" /t

rem Skip if Speex file already exists
if exist "%tful%\%pfol%\%~n1.spx" goto :EOF

@echo on
ffmpeg -i "%~1" "%ptmp%\podwork\%pfol%\%~n1.wav"
if not exist "%ptmp%\podwork\%pfol%\%~n1.wav" goto :EOF
speexenc --quality %quality% "%ptmp%\podwork\%pfol%\%~n1.wav" "%tful%\%pfol%\%~n1.spx"
@echo off
endlocal


IMPORTANT NOTE: WordPress.com gets confused by double hyphens. Line 44 of podwork.cmd should start with “speexenc --quality” (two plain hyphens), not with a single hyphen or a long dash.
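
To make the path handling in podwork.cmd easier to follow, here is a rough Python sketch of how the output file name is derived: the name of the folder holding the mp3 becomes a subfolder of the output folder, and the extension changes to .spx. The function names are invented for this illustration, and ntpath is used so the Windows path rules apply on any operating system.

```python
import ntpath  # Windows path rules, usable on any OS


def strip_trailing_slash(folder):
    """Mimic the script: append a backslash, collapse doubled backslashes, drop the last character."""
    folder = folder + "\\"
    folder = folder.replace("\\\\", "\\")
    return folder[:-1]


def speex_target(mp3_path, out_root):
    """Return the path where the .spx file for mp3_path would be written."""
    # Name of the folder that holds the mp3 (pfol in the batch file).
    feed_folder = ntpath.basename(strip_trailing_slash(ntpath.dirname(mp3_path)))
    # File name without its extension (%~n1 in the batch file).
    stem = ntpath.splitext(ntpath.basename(mp3_path))[0]
    return ntpath.join(strip_trailing_slash(out_root), feed_folder, stem + ".spx")


if __name__ == "__main__":
    # Hypothetical podcast file; the folder names match the ones used in this post.
    print(speex_target(r"c:\podcasts\Some Show\episode1.mp3", "c:\\pda2go\\"))
```

The append, collapse and trim steps are what the three "set" lines do for each folder variable, so a folder name works with or without a trailing backslash.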

That works for new podcast downloads, but since I had downloaded some podcasts already, I wrote a quick podsweep.cmd script to run podwork.cmd for each pre-existing mp3 file. Make sure Juice is closed before running it. It is hard-coded for what I use as my Juice top-level download folder and my output folder:

podsweep.cmd

cd \podcasts
sweep.cmd for %%f in (*.mp3) do podwork "%%~ff" "c:\pda2go"

podsweep.cmd depends on sweep.cmd to run the command in every directory. With sweep.cmd, I mimic an ancient MS-DOS tool from the 1980s called SWEEP.COM:

sweep.cmd

@echo off
setlocal
set oldcd=%cd%
for /R %%D in (.) do (
  cd /d "%%D"
  %*
)
cd /d "%oldcd%"
endlocal
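
The idea behind sweep.cmd, running one command in the starting folder and in every folder beneath it, can be sketched in Python as follows. This is only an illustration of the technique, and the function name sweep is my own. It passes the folder to the child process as its working directory instead of changing the current directory of the script itself, so there is nothing to restore afterwards.

```python
import os
import subprocess


def sweep(command, root="."):
    """Run command once in root and once in every directory beneath it."""
    visited = []
    for folder, _subdirs, _files in os.walk(root):
        # cwd= makes the command run "inside" the folder, like cd in the batch file.
        subprocess.run(command, cwd=folder, shell=True, check=False)
        visited.append(folder)
    return visited


if __name__ == "__main__":
    # Harmless demonstration command; replace it with whatever should run in each folder.
    for folder in sweep("echo swept", "."):
        print("visited", folder)
```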

Thanks to Chris Pirillo for the information on getting Juice to work in Windows Vista.

Drivercab Helper (AutoIT3 example)

This is a utility I wrote that works in conjunction with HFSLIP to slipstream additional drivers into a Windows 2000 or XP install. My announcement thread on MSFN has instructions and a screen shot. It is written in AutoIT3, which is similar to BASIC. The GUI work was done in Koda form designer.
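
The core of the script is the INF rewrite done by ModifyInfFile: a driver.cab line is inserted after each SourceDisksNames header, and every entry under SourceDisksFiles is pointed at disk 1. Here is a rough Python sketch of that logic, working on a list of lines rather than on files. The function name rewrite_inf and the sample data are invented for illustration, and unlike the AutoIt code it treats a line as a section header only if it contains both brackets.

```python
NAMES_SECTIONS = {
    "[SOURCEDISKSNAMES]",
    "[SOURCEDISKSNAMES.X86]",
    "[SOURCEDISKSNAMES.X64]",
}


def rewrite_inf(lines, iso_title="HFSLIPCD"):
    """Return INF lines rewritten so every source file refers to disk 1 (driver.cab)."""
    cab_line = f'1 = "{iso_title}",driver.cab,,"\\I386"'
    out = []
    in_files = False  # True while inside [SourceDisksFiles]
    for raw in lines:
        # Split off a trailing comment so it can be re-attached unchanged.
        body, sep, rest = raw.partition(";")
        comment = sep + rest
        start = body.find("[")
        end = body.find("]", start + 1)
        if start != -1 and end != -1:
            section = body[start:end + 1].upper()
            out.append(body + comment)
            if section in NAMES_SECTIONS:
                out.append(cab_line)
            in_files = section == "[SOURCEDISKSFILES]"
        else:
            if in_files and "=" in body:
                # Keep everything up to the last "=" and point the entry at disk 1.
                body = body[:body.rfind("=") + 1] + "1"
            out.append(body + comment)
    return out


if __name__ == "__main__":
    sample = ["[SourceDisksNames]", "[SourceDisksFiles]", "foo.sys = 2 ; driver"]
    print("\n".join(rewrite_inf(sample)))
```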

#Region ;**** Directives created by AutoIt3Wrapper_GUI ****
#AutoIt3Wrapper_UseUpx=n
#EndRegion ;**** Directives created by AutoIt3Wrapper_GUI ****
; Drivercab Helper
; Version 1.2
; by David Eason

; Thanks to (in no order)
; TommyP and Tomcat76 for HFSLIP
; Tim Fehlman for sample code described in "Folder Recursion in Autoit" at http://dailycupoftech.com/folder-recursion-in-autoit/
; Lookfar, LazyCat, and JosBe for Koda Form Designer for AutoIt
; Jon Bennett for Autoit3 and adding Unicode support for version v3.2.4.0

; Requires Autoit v3.2.4.0 build to compile due to Unicode support

; History
; 1.1 First  version ;)
; 1.2 Added support for Unicode .INF files

#include <GUIConstants.au3>
#include <File.au3>

If @AutoItVersion < "3.2.4.0" Then
	MsgBox(0, "Version Check", "This AutoIt script requires v.3.2.4.0 or later")
	Exit 1
EndIf

If @Unicode <> 1 Then
	MsgBox(0, "Unicode Check", "This AutoIt script requires the Unicode build")
	Exit 2
EndIf

; Globals
Dim $SourceFolder, $DestFolder, $StopProcessingFiles, $dir

#Region ### START Koda GUI section ### Form=d:\drivercabhelper\aform1.kxf
$Form1_1 = GUICreate("Drivercab Helper 1.2 for HFSLIP", 591, 507, 229, 78)
$Group1 = GUICtrlCreateGroup("Log", 13, 189, 560, 299)
$Edit1 = GUICtrlCreateEdit("", 26, 209, 534, 267, BitOR($ES_AUTOVSCROLL, $ES_AUTOHSCROLL, $ES_READONLY, $ES_WANTRETURN, $WS_HSCROLL, $WS_VSCROLL))
GUICtrlSetData(-1, "")
GUICtrlCreateGroup("", -99, -99, 1, 1)
$Group2 = GUICtrlCreateGroup("&Folders", 13, 20, 560, 117)
$Input1 = GUICtrlCreateInput("", 137, 40, 361, 21)
GUICtrlSetTip(-1, "Specify the folder containing existing INF files and driver files")
$Button1 = GUICtrlCreateButton("...", 512, 40, 36, 20, 0)
$Label1 = GUICtrlCreateLabel("&Source folder:", 33, 46, 76, 17)
$Input2 = GUICtrlCreateInput("", 137, 85, 361, 21)
GUICtrlSetTip(-1, "Specify the folder to be deleted and populated with modified INF files and existing driver files")
$Button2 = GUICtrlCreateButton("...", 512, 85, 36, 20, 0)
$Label2 = GUICtrlCreateLabel("&Destination folder:", 33, 85, 95, 17)
$Label3 = GUICtrlCreateLabel("Destination folder will be deleted and overwritten", 137, 111, 282, 17)
GUICtrlSetFont(-1, 8, 800, 0, "MS Sans Serif")
GUICtrlCreateGroup("", -99, -99, 1, 1)
$Button3 = GUICtrlCreateButton("Delete and &regenerate destination folder now!", 253, 150, 297, 20, 0)
$Label4 = GUICtrlCreateLabel("&ISOTITLE", 32, 152, 58, 17)
$Input3 = GUICtrlCreateInput("HFSLIPCD", 88, 152, 121, 21)
GUISetState(@SW_SHOW)
#EndRegion ### END Koda GUI section ###

If $CmdLine[0] > 0 Then
	GUICtrlSetData($Input1, $CmdLine[1])
EndIf

If $CmdLine[0] > 1 Then
	GUICtrlSetData($Input2, $CmdLine[2])
EndIf

While 1
	$nMsg = GUIGetMsg()
	Switch $nMsg
		Case $GUI_EVENT_CLOSE
			Exit
		Case $Button1
			$dir = FileSelectFolder("Source folder", "", 6)
			If @error = 0 Then
				GUICtrlSetData($Input1, $dir)
			EndIf
		Case $Button2
			$dir = FileSelectFolder("Destination folder (will be overwritten)", "", 7)
			If @error = 0 Then
				GUICtrlSetData($Input2, $dir)
			EndIf
		Case $Button3
			If StringStripWS(GUICtrlRead($Input3), 1) = "" Then
				GUICtrlSetData($Input3, "HFSLIPCD")
			EndIf
			$IsoTitle = GUICtrlRead($Input3)
			$CabLine = '1 = "' & $IsoTitle & '",driver.cab,,"\I386"'
			$SourceFolder = GUICtrlRead($Input1)
			$DestFolder = GUICtrlRead($Input2)
			If ValidateFolders($SourceFolder, $DestFolder) Then
				$StopProcessingFiles = False
				DirRemove($DestFolder, 1)
				ScanFolder($SourceFolder, $DestFolder)
				If Not $StopProcessingFiles Then
					LogToWindow('Completed reading "' & $SourceFolder & '" and writing "' & $DestFolder & '".')
					MsgBox(262144, "Drivercab Helper", "Drivercab Helper has finished.")
				EndIf
			EndIf
	EndSwitch
WEnd

Func ValidateFolders($SourceFolder, $DestFolder)

	If StringRight($SourceFolder, 1) = "\" Or StringRight($SourceFolder, 1) = "/" Then
		$SourceFolder = StringLeft($SourceFolder, StringLen($SourceFolder) - 1)
		GUICtrlSetData($Input1, $SourceFolder)
	EndIf

	If StringRight($DestFolder, 1) = "\" Or StringRight($DestFolder, 1) = "/" Then
		$DestFolder = StringLeft($DestFolder, StringLen($DestFolder) - 1)
		GUICtrlSetData($Input2, $DestFolder)
	EndIf

	If $SourceFolder = "" Then
		ShowError('Please specify a source folder.')
		Return False
	EndIf

	If _PathFull($SourceFolder) <> $SourceFolder Then
		ShowError('Source folder "' & $SourceFolder & '" cannot be a relative path.')
		Return False
	EndIf

	; Source folder must exist
	If DirGetSize($SourceFolder) = -1 Then
		ShowError('Source folder "' & $SourceFolder & '" does not exist.')
		Return False
	EndIf

	If $DestFolder = "" Then
		ShowError('Please specify a destination folder.')
		Return False
	EndIf

	If _PathFull($DestFolder) <> $DestFolder Then
		ShowError('Destination folder "' & $DestFolder & '" cannot be a relative path.')
		Return False
	EndIf

	If DirGetSize($DestFolder) <> -1 Then
		LogToWindow('Destination folder "' & $DestFolder & '" exists and will be overwritten.')
		; not an error
	EndIf

	If StringInStr($SourceFolder, $DestFolder) > 0 Then
		ShowError('Destination folder "' & $DestFolder & '" must not contain source folder "' & $SourceFolder & '".')
		Return False
	EndIf

	Return True
EndFunc   ;==>ValidateFolders

Func LogToWindow($s)
	GUICtrlSetData($Edit1, $s & @CRLF, 1)
EndFunc   ;==>LogToWindow

Func ModifyInfFile($SourceFileName, $DestFileName)

	Local $line, $comment, $in_sourcedisksfiles, $section_name_this_line
	Local $comment_pos, $equals_pos, $left_bracket_pos, $right_bracket_pos
	Local $BOMCheck

	$BOMCheck = _BOMCheck($SourceFileName)

	If $BOMCheck = -1 Then
		ShowError('Unable to determine if "' & $SourceFileName & '" is ANSI or Unicode.')
		Return False
	EndIf

	$SourceInfFile = FileOpen($SourceFileName, 0) ;Read

	If $SourceInfFile = -1 Then
		ShowError('Unable to open file "' & $SourceFileName & '" for reading.')
		Return False
	EndIf

	$DestInfFile = FileOpen($DestFileName, 10 + $BOMCheck)  ; 2 (erase and write) + 8 (create folders) + encoding mode from _BOMCheck

	If $DestInfFile = -1 Then
		ShowError('Unable to open file "' & $DestFileName & '" for writing.')
		FileClose($SourceInfFile)
		Return False
	EndIf

	LogToWindow('Reading ' & '"' & $SourceFileName & '",' & @CRLF & 'Writing "' & $DestFileName & '".')

	$in_sourcedisksfiles = False
	$line_num = 0
	While 1
		$line = FileReadLine($SourceInfFile)
		If @error = -1 Then ExitLoop
		$line_num = $line_num + 1
		$comment_pos = StringInStr($line, ";")
		If $comment_pos > 0 Then
			$comment = StringMid($line, $comment_pos, StringLen($line))
			$line = StringLeft($line, $comment_pos - 1)
		Else
			$comment = ""
		EndIf

		StringStripWS($line, 8)
		$left_bracket_pos = StringInStr($line, "[")
		If $left_bracket_pos > 0 Then
			$right_bracket_pos = StringInStr($line, "]")
			; StringMid takes a character count, not an end position
			$section_name_this_line = StringMid($line, $left_bracket_pos, $right_bracket_pos - $left_bracket_pos + 1)
			$section_name = $section_name_this_line
			Switch StringUpper($section_name_this_line)
				Case "[SOURCEDISKSNAMES]", "[SOURCEDISKSNAMES.X86]", "[SOURCEDISKSNAMES.X64]"
					FileWriteLine($DestInfFile, $line & $comment)
					FileWriteLine($DestInfFile, $CabLine)
					LogToWindow($SourceFileName & '(' & $line_num & '): Inserted "' & $CabLine & '" in section "' & $section_name & '".')
					$in_sourcedisksfiles = False
				Case "[SOURCEDISKSFILES]"
					FileWriteLine($DestInfFile, $line & $comment)
					$in_sourcedisksfiles = True
				Case Else
					;Still a section header
					FileWriteLine($DestInfFile, $line & $comment)
					$in_sourcedisksfiles = False
			EndSwitch
		Else
			;Not a section header
			If $in_sourcedisksfiles = True Then
				$equals_pos = StringInStr($line, "=", 0, -1)
				If $equals_pos > 0 Then
					$old_line = $line
					$line = StringLeft($line, $equals_pos) & "1"
					LogToWindow($SourceFileName & '(' & $line_num & '):  Changed "' & $old_line & '" to "' & $line & '" in section "' & $section_name & '".')
				EndIf
			EndIf
			FileWriteLine($DestInfFile, $line & $comment)
		EndIf
		$section_name_this_line = "";
	WEnd

	FileClose($SourceInfFile)
	FileClose($DestInfFile)
	Return True
EndFunc   ;==>ModifyInfFile

Func ScanFolder($SourceFolder, $DestFolder)

	If $StopProcessingFiles Then
		Return
	EndIf

	Local $Search
	Local $File
	Local $FileAttributes
	Local $SourceFullFilePath
	Local $DestFullFilePath

	$Search = FileFindFirstFile($SourceFolder & "\*.*")

	While Not $StopProcessingFiles
		If $Search = -1 Then
			ExitLoop
		EndIf

		$File = FileFindNextFile($Search)
		If @error Then ExitLoop

		$SourceFullFilePath = $SourceFolder & "\" & $File
		$DestFullFilePath = $DestFolder & "\" & $File
		$FileAttributes = FileGetAttrib($SourceFullFilePath)

		If StringInStr($FileAttributes, "D") Then
			ScanFolder($SourceFullFilePath, $DestFullFilePath)
		Else
			LogFile($SourceFullFilePath, $DestFullFilePath)
		EndIf

	WEnd

	FileClose($Search)
EndFunc   ;==>ScanFolder

Func LogFile($SourceFileName, $DestFileName)
	If StringLen($SourceFileName) >= 4 And StringUpper(StringRight($SourceFileName, 4)) = ".INF" Then
		If Not ModifyInfFile($SourceFileName, $DestFileName) Then
			$StopProcessingFiles = True
		EndIf
	Else
		$FCStatus = FileCopy($SourceFileName, $DestFileName, 8)
		If $FCStatus = 1 Then
			LogToWindow('Copied "' & $SourceFileName & '" to "' & $DestFileName & '".')
		Else
			ShowError('Unable to copy "' & $SourceFileName & '" to "' & $DestFileName & '".')
			$StopProcessingFiles = True
		EndIf
	EndIf
EndFunc   ;==>LogFile

Func ShowError($ErrStr)
	MsgBox(16, "Error", $ErrStr)
EndFunc   ;==>ShowError

; Function Name:   _BOMCheck()
;
; Description:     Determines whether a given file is ANSI,
;                  UTF-16 Little Endian, UTF-16 Big Endian, or UTF-8
;
; Syntax:          _BOMCheck ( $filename )
;
; Parameter(s):    $filename   = The file to check

;
; Requirement(s):  Must be Unicode build of AutoIt v3.2.4.0 or later.
;
; Return Value(s): ANSI or Unsupported:    Returns 0
;                  UTF-16 Little Endian:   Returns 32
;                  UTF-16 Big Endian:      Returns 64
;                  UTF-8:                  Returns 128
;                  Problem opening file:   Returns -1
;                  The general idea with the return value is that it can
;                  be used in the calculation of the FileOpen mode

; Author:          David Eason

; Sample usage:
; While 1
;     Local $file = FileOpenDialog("Choose file", @DesktopDir & "\", "Text files (*.txt;*.inf)", 3)
;     if @error = 1 Then ExitLoop
;     MsgBox(0,"", _BOMCheck($file))
; Wend

Func _BOMCheck(ByRef $filename)

	; Supported Byte Order Markers for non-ANSI
	Local $BOMS[3]
	$BOMS[0] = Binary("0xFFFE")     ;UTF-16 Little Endian
	$BOMS[1] = Binary("0xFEFF")     ;UTF-16 Big Endian
	$BOMS[2] = Binary("0xEFBBBF")   ;UTF-8

	; Corresponding mode bit for FileOpen
	Local $FileModes[3]
	$FileModes[0] = 32
	$FileModes[1] = 64
	$FileModes[2] = 128

	Local $FH = FileOpen($filename, 4)

	If $FH = -1 Then
		Return -1
	EndIf
	Local $FirstBytes = FileRead($FH, 3)

	If @error = -1 Then
		FileClose($FH)
		Return -1
	EndIf

	Local $I
	For $I = 0 To 2
		If BinaryMid($FirstBytes, 1, BinaryLen($BOMS[$I])) = BinaryMid($BOMS[$I], 1, BinaryLen($BOMS[$I])) Then
			FileClose($FH)
			Return $FileModes[$I]
		EndIf
	Next

	; If still here, then it is presumed ANSI
	FileClose($FH)
	Return 0

EndFunc   ;==>_BOMCheck
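
For readers who do not use AutoIt, here is a minimal Python sketch of the same byte order mark check. It returns the same numbers as _BOMCheck so the comparison is easy to follow. The function name bom_check is my own, and the numeric values only mean something as AutoIt FileOpen modes.

```python
BOMS = [
    (b"\xff\xfe", 32),       # UTF-16 little endian
    (b"\xfe\xff", 64),       # UTF-16 big endian
    (b"\xef\xbb\xbf", 128),  # UTF-8
]


def bom_check(path):
    """Return 32, 64, 128, or 0 (no BOM found) for the file at path, or -1 if it cannot be read."""
    try:
        with open(path, "rb") as fh:
            head = fh.read(3)
    except OSError:
        return -1
    if not head:
        # An empty file is treated as unreadable here, mirroring the end-of-file check in _BOMCheck.
        return -1
    for bom, mode in BOMS:
        if head.startswith(bom):
            return mode
    return 0  # presumed ANSI


if __name__ == "__main__":
    print(bom_check(__file__))
```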