My take is that if the test simulation reproduced results similar to those from the empirical equation, then the model (with varying wind speed and tides) should be computing the wind setup properly for the physical parameters it was given.
The question then becomes: what led to the 40 cm wind setup in the observations? Did that figure actually represent storm surge instead (i.e., were you comparing the observed tide level against the predicted tide level)? Or maybe the model bathymetry is incorrect (i.e., the water depth is too deep), etc. Just a thought.
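
To make the two checks concrete, here is a rough sketch. It assumes the simplest steady-state, constant-depth balance between wind stress and surface slope, setup = rho_air * Cd * U^2 * F / (rho_water * g * h), which may not be the empirical equation you used. The function names, the drag coefficient, and the example wind speed, fetch, and depths are all my own placeholders, not values from your setup.

```python
def wind_setup(wind_speed, fetch, depth, drag_coeff=1.3e-3,
               rho_air=1.225, rho_water=1025.0, g=9.81):
    """Steady-state wind setup (m) over a basin of constant depth.

    Assumed linearized balance: surface slope = wind stress / (rho_water * g * depth).
    drag_coeff is an assumed typical value, not a calibrated one.
    """
    wind_stress = rho_air * drag_coeff * wind_speed ** 2  # N/m^2
    return wind_stress * fetch / (rho_water * g * depth)


def surge_residual(observed_levels, predicted_tide_levels):
    """Observed minus predicted tide level (m) at matching times.

    This residual is the total surge (wind setup plus pressure and other
    effects), so it is not directly comparable to wind setup alone.
    """
    return [obs - pred for obs, pred in zip(observed_levels, predicted_tide_levels)]


# Hypothetical numbers: 20 m/s wind over a 50 km fetch.
shallow = wind_setup(wind_speed=20.0, fetch=50_000.0, depth=5.0)
deep = wind_setup(wind_speed=20.0, fetch=50_000.0, depth=10.0)
print(f"setup at 5 m depth:  {shallow:.2f} m")
print(f"setup at 10 m depth: {deep:.2f} m")
```

In this simplified balance the setup is inversely proportional to depth, so a model bathymetry that is twice as deep as reality would give half the setup for the same wind. The residual function is just the observed-minus-predicted comparison; if the 40 cm came from that kind of difference, it would include more than wind setup.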