Q. Zhuge, Z. Shao, B. Xiao, and E. H.-M. Sha, Design space minimization with timing and code size optimization for embedded DSP, Proceedings of the 1st IEEE/ACM/IFIP international conference on Hardware/software codesign & system synthesis, CODES+ISSS '03, pp.144-149, 2003.
DOI : 10.1145/944645.944685

Q. Zhuge, B. Xiao, Z. Shao, E. H.-M. Sha, and C. Chantrapornchai, Optimal code size reduction for software-pipelined and unfolded loops, Proceedings of the 15th international symposium on System Synthesis, ISSS '02, pp.144-149, 2002.
DOI : 10.1145/581199.581232

T. W. O'Neil and E. H.-M. Sha, Combining extended retiming and unfolding for rate-optimal graph transformation, J. of VLSI Sign. Process, vol.39, issue.3, pp.273-293, 2005.

Q. Zhuge, C. Xue, Z. Shao, M. Liu, M. Qiu et al., Design optimization and space minimization considering timing and code size via retiming and unfolding, Microprocessors and Microsystems, vol.30, issue.4, pp.173-183, 2006.
DOI : 10.1016/j.micpro.2005.11.002
URL : http://www.cs.cityu.edu.hk/~jasonxue/papers/QF_Micro-Journal.pdf

C. Xue, Z. Shao, M. Liu, and E. H.-M. Sha, Iterational retiming, Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis, CODES+ISSS '05, pp.309-314, 2005.
DOI : 10.1145/1084834.1084910

C. Xue and E. H.-M. Sha, Maximize Parallelism Minimize Overhead for Nested Loops via Loop Striping, The Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, vol.2, issue.4, pp.153-167, 2006.
DOI : 10.1007/BF01407876
URL : http://hdl.handle.net/10397/20824

N. L. Passos and E. H.-M. Sha, Achieving full parallelism using multidimensional retiming, IEEE Transactions on Parallel and Distributed Systems, vol.7, issue.11, pp.1150-1163, 1996.
DOI : 10.1109/71.544356
URL : http://www.coins.nd.edu/~esha/papers/nelson/journal/TPDS96.ps

M. Sheliga, N. L. Passos, and E. H.-M. Sha, Fully parallel hardware/software codesign for multidimensional DSP applications, CODES, Pennsylvania (USA), pp.18-20, 1996.
DOI : 10.1109/hcs.1996.492222
URL : http://www.cse.nd.edu/~esha/papers/nelson/iwhsc96.ps

C. J. Xue, Z. Shao, M. Liu, M. K. Qiu, and E. H.-M. Sha, Optimizing parallelism for nested loops with iterational and instructional retiming, J. Embed. Comput, vol.3, issue.1, pp.29-37, 2009.
DOI : 10.1007/11596356_19
URL : http://www.utdallas.edu/~cxx016000/papers/Jason/euc05_iterInst_xue_c4.pdf

Y. Elloumi, M. Akil, and M. H. Bedoui, Timing and code size optimization on achieving full parallelism in uniform nested loop, J. Comp, vol.3, issue.7, 2011.

Y. Elloumi, M. Akil, and M. H. Bedoui, Execution Time Optimization Using Delayed Multidimensional Retiming, 2012 IEEE/ACM 16th International Symposium on Distributed Simulation and Real Time Applications, pp.177-184, 2012.
DOI : 10.1109/DS-RT.2012.34
URL : https://hal.archives-ouvertes.fr/hal-01796770

L. Kaouane, M. Akil, T. Grandpierre, and Y. Sorel, A Methodology to Implement Real-Time Applications onto Reconfigurable Circuits, The Journal of Supercomputing, vol.30, issue.3, pp.283-301, 2004.
DOI : 10.1023/B:SUPE.0000045213.82276.8e

Y. Elloumi, M. Akil, and M. H. Bedoui, Achieving Minimal Cycle Period with Delayed Multidimensional Retiming, J. App. Soft Comp

T. W. O'Neil, S. Tongsima, and E. H.-M. Sha, Extended retiming: Optimal scheduling via a graph-theoretical approach, ICASSP, vol.4, pp.2001-2004, 1999.

K. K. Parhi and D. G. Messerschmitt, Static rate-optimal scheduling of iterative data-flow programs via optimum unfolding, IEEE Transactions on Computers, vol.40, issue.2, pp.178-195, 1991.
DOI : 10.1109/12.73588

O. Lobachev, M. Guthe, and R. Loogen, Estimating parallel performance, Journal of Parallel and Distributed Computing, vol.73, issue.6, 2013.
DOI : 10.1016/j.jpdc.2013.01.011

G. Romanazzi, P. K. Jimack, and C. E. Goodyer, Reliable performance prediction for multigrid software on distributed memory systems, Advances in Engineering Software, vol.42, issue.5, pp.247-258, 2011.
DOI : 10.1016/j.advengsoft.2010.10.005
URL : http://www.scs.leeds.ac.uk/pkj/Papers/Journals/RJG10.pdf

Q. Zhuge, C. Xue, M. Qiu, J. Hu, and E. H.-M. Sha, Timing optimization via nest-loop pipelining considering code size, Microprocessors and Microsystems, vol.32, issue.7, pp.351-363, 2008.
DOI : 10.1016/j.micpro.2008.02.002