Inverse lithography technique treats the mask design as an inverse mathematical problem that aims at synthesizing an input mask to deliver a desired output pattern on the wafer. Liu and Zakhor pioneered a mask design method based on the branch and bound algorithm and simulated annealing.^{12}^{,}^{13} But this is a time-consuming process. In order to reduce the computational complexity, iterative methods were proposed to solve the inverse problem via an optimization process,^{14}^{,}^{15} and they were further classified as linear, quadratic, and nonlinear optimization problems.^{16}^{,}^{17} Meanwhile, Poonawala and Milanfar designed the model-based optical proximity correction system and introduced the steepest descent algorithm for the optimization framework.^{18}^{,}^{19} Subsequently, the optimization framework was further generalized for phase-shifting masks^{20}^{,}^{21} and partially coherent imaging systems,^{22}^{–}^{26} and the optimization algorithm was improved with an active set method^{21} and with an augmented Lagrangian method.^{27} Most recently, Lv et al. further improved the computational efficiency of the mask synthesis by using the conjugate gradient and an optimal iterative step,^{28} and by introducing a mask filtering technique to enhance the regularity of the synthesized mask pattern.^{29}^{,}^{30} On the other hand, with ever-shrinking feature size, the printed feature becomes increasingly sensitive to the fluctuation of the fabrication process, which limits the yield in semiconductor industry. In order to synthesize masks that are robust to process variations, the average wafer performance is optimized via minimizing the expectation of the difference between the desired pattern and the output pattern with respect to process fluctuations.^{31}^{–}^{36} This approach takes into account process variations explicitly, and is well understood and easily accomplished. However, it further increases the computational complexity. It is noted that all these methods discretize mask as a raster image constituted by grids (pixels) and perform mask synthesis on a certain fine grid throughout the mask synthesizing process. Since the computational complexity of such iterative methods is $O[K\u2062log(K)]$,^{14}^{,}^{17}^{,}^{18} where $K$ is total grid numbers used in the discretization of mask, the mask grid size strongly impacts the computational cost. Naturally, one would like to employ a coarser grid to model mask and, thus, to cut down the runtime. Unfortunately, such straightforward strategy usually leads to inaccurate simulation results and a poorly performing mask that fails to optimally utilize the imaging capability of the lithography scanner. There has been a tradeoff between the runtime and mask quality.