gdal/python: extracting projection info from hdf file

This is fairly straightforward if you think of the HDF dataset as a container, where each subdataset is a raster image with its own projection.

Your error is in not opening the subdataset: GetSubDatasets() only returns a list of (name, description) tuples, and the name string is what you pass to gdal.Open() to get at each raster.

from osgeo import gdal

# open the HDF container
hdf_ds = gdal.Open(hdfFile)

# this is just the name string of the fifth subdataset, not an open dataset
b3_string = hdf_ds.GetSubDatasets()[4][0]

# open the subdataset using that name
sds_b3 = gdal.Open(b3_string)

# get the projection
proj = sds_b3.GetProjection()
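
Hard-coding the index [4] is fragile if another file lists its subdatasets in a different order. As a rough sketch (the helper names find_subdataset_name and subdataset_projection, and the keyword argument, are my own and not part of GDAL), you could pick the subdataset by matching text in its name or description instead:

```python
def find_subdataset_name(subdatasets, keyword):
    """Return the name string of the first subdataset matching keyword.

    subdatasets is the list of (name, description) tuples returned by
    GetSubDatasets(); keyword is a hypothetical search term.
    """
    for name, description in subdatasets:
        if keyword in name or keyword in description:
            return name
    raise KeyError("no subdataset matching %r" % keyword)


def subdataset_projection(hdf_path, keyword):
    """Open the matching subdataset and return its projection as WKT."""
    # Imported here so find_subdataset_name can be used without GDAL installed.
    from osgeo import gdal

    container = gdal.Open(hdf_path)
    if container is None:
        raise IOError("GDAL could not open %s" % hdf_path)

    sds = gdal.Open(find_subdataset_name(container.GetSubDatasets(), keyword))
    if sds is None:
        raise IOError("GDAL could not open the subdataset for %r" % keyword)
    return sds.GetProjection()
```

The keyword is only a placeholder: print hdf_ds.GetSubDatasets() first to see which names and descriptions your file actually contains.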

I suspect your issue is happening right in the beginning of the script, perhaps not getting at the datasets properly here:

ds = open(hdfFile)
sds_b3 = ds.GetSubDatasets()[4][0]
sds_b4 = ds.GetSubDatasets()[5][0]

...there doesn't seem to be anything wrong with your syntax for proj = ds.GetProjection().

I've been converting GeoTiffs to numpy arrays and back again, and this works for me:

import os
import numpy
from osgeo import gdal

# Open the input raster and get some info about it
ds = gdal.Open(inRast)
b = ds.GetRasterBand(1) # Assuming there's only the one band
xSize = b.XSize
ySize = b.YSize
# Getting geotransform and projection information...
geoTrans = ds.GetGeoTransform()
wktProjection = ds.GetProjection() # Well-Known Text.

# Reading to numpy array...
bArr = b.ReadAsArray()

# do stuff with the array now...

# Save the same-size-and-shape array (or another like it) back to GeoTiff
dst_filename = os.path.join(r"c:\my\save\path", "afilename.tif")
driver = gdal.GetDriverByName('GTiff')
# For a 32-bit integer, other formats possible too:
dataset = driver.Create(dst_filename, xSize, ySize, 1, gdal.GDT_Int32)
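# Now we set the georeferencing and projection info from above:
dataset.SetGeoTransform(geoTrans)
dataset.SetProjection(wktProjection)
# And we write the data:
oBand = dataset.GetRasterBand(1)
oBand.WriteArray(bArr)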
# oBand.SetNoDataValue(NoDataVal) # In case you want to set this

del dataset # Cleaning up and clearing memory.
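
If you need map coordinates for a cell in the array, the geoTrans tuple read above is the six-coefficient affine transform GDAL uses to go from pixel/line positions to georeferenced coordinates. A small sketch of that calculation (the function name pixel_to_geo and the sample numbers are made up for illustration):

```python
def pixel_to_geo(geo_transform, col, row):
    """Map a pixel (col, row) position to georeferenced (x, y).

    geo_transform is the 6-tuple from ds.GetGeoTransform():
    (x origin, pixel width, row rotation, y origin, column rotation, pixel height).
    """
    x = geo_transform[0] + col * geo_transform[1] + row * geo_transform[2]
    y = geo_transform[3] + col * geo_transform[4] + row * geo_transform[5]
    return x, y


# Made-up north-up raster: origin at (500000, 4100000), 30-unit pixels.
sample_gt = (500000.0, 30.0, 0.0, 4100000.0, 0.0, -30.0)
corner = pixel_to_geo(sample_gt, 0, 0)      # top-left corner of the top-left pixel
cell = pixel_to_geo(sample_gt, 10.5, 20.5)  # centre of the pixel at col 10, row 20
```

Integer col/row values give the top-left corner of a pixel; adding 0.5 to each gives its centre.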