; +++++++++++
;This returns the square root of HL, rounded down.
;Interestingly, it is faster than division.
SqrtHL4:
;39 bytes
;Inputs:
;     HL
;Outputs:
;     BC is the remainder
;     D is not changed
;     E is the square root
;     H is 0
;Destroys:
;     A
;     L is one of {0,1,4,5}
;          every bit except bits 0 and 2 is always zero
;Caveat (found by hand-tracing the code; verify on your own inputs):
;     An overflow out of C forces the subtraction without a 9-bit compare,
;     and B is set from the overflow flag of the last pass.
;     HL=4100h (16640) returns E=129 instead of 128.
;     HL=3FFFh (16383) returns BC=01FEh instead of 00FEh.
;     For HL < 4000h, E and C are still exact.
;Columns: size in bytes, cycles per execution, total cycles
     ld bc,0800h       ;3  10     ;10
     ld e,c            ;1  4      ;4
     xor a             ;1  4      ;4
SHL4Loop:
     add hl,hl         ;1  11     ;88
     rl c              ;2  8      ;64
     adc hl,hl         ;2  15     ;120
     rl c              ;2  8      ;64
     jr nc,SHL4skp1    ;2  7|12   ;96+3y   ;y is the number of overflows. max is 2
     set 0,l           ;2  8      ;--
SHL4skp1:
     ld a,e            ;1  4      ;32
     add a,a           ;1  4      ;32
     ld e,a            ;1  4      ;32
     add a,a           ;1  4      ;32
     bit 0,l           ;2  8      ;64
     jr nz,SHL4skp2    ;2  7|12   ;56-6y
     sub c             ;1  4      ;32
     jr nc,SHL4skp3    ;2  7|12   ;96+15x  ;x is the number of set bits in the result
SHL4skp2:
     ld a,c            ;1  4      ;
     sub e             ;1  4      ;
     inc e             ;1  4      ;
     sub e             ;1  4      ;
     ld c,a            ;1  4      ;
SHL4skp3:
     djnz SHL4Loop     ;2  13|8   ;99
     bit 0,l           ;2  8      ;8
     ret z             ;1  11|19  ;11+8z
     inc b             ;1         ;
     ret               ;1         ;
;944+15x-3y+8z
;x is the number of set bits in the result
;y is the number of overflows (max is 2)
;z is 1 if 'b' is returned as 1
;max is 1066 cycles
;min is 944 cycles
end
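
;Usage sketch, not part of the original routine. SqrtDemo is a
;hypothetical caller added only to illustrate the inputs and outputs
;listed for SqrtHL4; it assumes SqrtHL4 is assembled in the same file.
SqrtDemo:
     ld hl,10000       ;input: 10000 = 100*100 + 0
     call SqrtHL4      ;returns E = 100 (root), BC = 0 (remainder)
     ld a,e            ;copy the root into A for the caller
     ret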